The Ethics of Economic Sanctions

Economic sanctions involve the politically motivated withdrawal of customary trade or financial relations from a state, organisation or individual. They may be imposed by the United Nations, regional governmental organisations such as the European Union, or by states acting alone.

Although economic sanctions have long been a feature of international relations, the end of the Cold War in the late 20th century saw a significant proliferation of their use. This made concerted international action possible where previously any action by the West was countered by the U.S.S.R. and vice versa. It meant that for the first time the United Nations Security Council could impose economic sanctions that, in theory at least, all member states were required to take part in. With this came the possibility of inflicting serious damage. Most notable during this period were the comprehensive sanctions imposed on Haiti, the former Yugoslav republics and Iraq. The harms caused to Haiti and the former Yugoslav republics were severe, but the harms suffered by Iraq were the worst ever caused by the use of economic sanctions outside of a war situation. UNICEF, for example, estimated that the economic sanctions imposed on Iraq led to the deaths of 500,000 children aged under five from malnutrition and disease.

Following the devastation caused by economic sanctions in Iraq, a wide variety of organisations began to seriously investigate the possibility of alternative forms of economic sanctions, sanctions not targeted against ‘ordinary people’ but rather targeted against those considered to be morally responsible for the objectionable policies of the target state. The results—‘targeted’ economic sanctions—became the UN’s economic sanctions tool of choice throughout the 2000s. Targeted economic sanctions include measures such as freezing the assets of top government officials or those suspected of financing terrorism, arms embargoes, nuclear sanctions and so on. The harms inflicted by targeted sanctions are, for the most part, much less extensive than those inflicted by previous episodes of economic sanctions which targeted entire populations. Nevertheless, they are not harmless and may still be morally problematic. For example, the arms embargo imposed during the break up of the former Yugoslavia was widely criticised as it did not permit the Bosnian Muslims to acquire the weapons they needed to defend themselves from the genocidal attacks of certain Bosnian-Serb forces.

Despite the obvious and serious moral problems associated with economic sanctions, the ethics of economic sanctions is a topic that has been curiously neglected by philosophers and political theorists. Only a handful of philosophical journal articles and book chapters have ever been published on the subject. This article describes the work that has been carried out.

The Nature of Economic Sanctions
The Ethics of Economic Sanctions
References and Further Reading

1. The Nature of Economic Sanctions

a. Definition

Economic sanctions are the deliberate withdrawal of customary trade or financial relations (Hufbauer et al., 2007), ordered by a state, supra-national or international governmental organisation (the ‘sender’) from any state, sub-state group, organisation or individual (the ‘target’) in response to the political behaviour of that target.

The specific elements of this definition merit some discussion. First, economic sanctions may comprise the withdrawal of customary trade or financial relations in whole or in part. Trade may be restricted in its entirety by refusing all imports and exports. If all imports and exports are refused then the sanctions are known as ‘comprehensive’ sanctions (though even in the case of comprehensive sanctions, humanitarian exemptions are usually made, for example, for food and medicine). In other cases, only some imports or exports are refused: usually commodities like oil and timber, or weapons in the case of arms embargoes. Financial restrictions include measures such as asset freezes, the denial of credit, the denial of banking services, the withdrawal of aid and so on. Again, withdrawal of financial relations may be comprehensive or not.

Second, economic sanctions may be ordered (or ‘imposed’) by a variety of actors. Sanctions can be ‘multilateral’, ordered by the United Nations or regional organisations such as the European Union, or they can be ‘unilateral’, ordered by one state acting alone. The actor ordering economic sanctions is typically known as the ‘sender’ of the sanctions.

In practical terms, contemporary economic sanctions are imposed by following a legal process. For example, economic sanctions mandated by the United Nations Security Council are required to be adopted by all member states under Chapter VII of the United Nations Charter. States then pass legislation prohibiting their citizens from entering into trading and/or financial relationships with the target and setting penalties for sanctions-breaking. So although we often talk of sanctions being ‘imposed’ on the target, it should be clear that economic sanctions are actually legal measures imposed by a sender on its own members. It is a sender’s own citizens who are prohibited from trading.

Further, note that this definition excludes measures undertaken by non-state actors, for example, consumer boycotts or boycotts undertaken by companies or religious organisations. Such measures are undeniably worthy of ethical enquiry; however, the ethical concerns they present are sufficiently distinctive to make it sensible to treat them as a separate issue.

Third, states are not the only targets of economic sanctions. Economic sanctions can be, and often are, imposed on sub-state groups. Well-known examples from the recent past are the sanctions imposed on Serb-controlled areas of the former Yugoslavia in the 1990s and the ban on trade in conflict diamonds that targeted sub-state rebel groups in parts of Africa. Economic sanctions can also be imposed on companies, organisations and individuals. For example, the UK regularly freezes the UK-held assets of companies, charities or individuals suspected of funding terrorist activities. For this reason it is perfectly possible for a state to sanction its own citizens. Those on the receiving end of economic sanctions are typically known as the ‘target’.

In recent years there has been a shift away from targeting entire states, and towards targeting economic sanctions more narrowly at specific sub-state groups and individuals—those considered responsible for the political behaviour the sanctions are responding to. The reasons for this are two-fold. First, it is expected that such sanctions are more likely to achieve their objectives. Second, it makes it less likely that the harms of sanctions will fall on innocent people. Economic sanctions that are narrowly targeted in this way are known as ‘targeted’ or ‘smart’ sanctions. There is no common term for sanctions imposed on an entire state. This entry suggests ‘collective’.

Fourth, under this definition, economic sanctions are imposed in response to the political behaviour of the target—as distinguished from its economic behaviour. Such a stipulation is common in the economic sanctions literature. For example, Robert Pape distinguishes economic sanctions from what he calls ‘trade wars’:

When the United States threatens China with economic punishment if it does not respect human rights, that is an economic sanction; when punishment is threatened over copyright infringement, that is a trade war (Pape, 1997, 94).

However, not everyone accepts this distinction. David Baldwin, for instance, denies that economic sanctions must be a response to political behaviour. For Baldwin, economic sanctions can be a response to any type of behaviour; there is no reason to restrict the definition of economic sanctions to those measures which aim to respond to political behaviour. Thus, contra Pape, Baldwin argues that if the U.S. imposes restrictions on trade with China over copyright issues then this is an economic sanction. Further, he argues that in any case there is no clear-cut distinction between the ‘political’ and the ‘economic’ and so there would be no clear-cut basis for making the distinction even if it were warranted (Baldwin, 1985).

In response to Baldwin, it is worth pointing out that in common usage the term ‘economic sanctions’ is actually reserved for a distinctive class of cases that we can roughly describe as being a response to political rather than economic behaviour. Baldwin is right that there is no clear-cut distinction between the political and the economic, but to categorise responses to both as economic sanctions is to ignore the fact that people do actually manage to make the distinction in practice.

Finally, the definition presented here makes no reference to the objective sought by economic sanctions or the mechanism by which they are expected to work. This is an advantage since both the question of the proper objectives of sanctions and the question of how they work are controversial.

b. Objectives

Economic sanctions theorists tend to conceptualise economic sanctions in one of two ways: as tools of foreign policy or as tools of international law enforcement. As tools of foreign policy, their objective is to achieve foreign policy goals. As tools of international law enforcement, their objective is to enforce international law or international moral norms.

i. Achievement of Foreign Policy Goals

Economic sanctions are most commonly conceptualised as being tools for achieving foreign policy goals. They are considered part of the foreign policy ‘toolkit’ (a range of measures that includes diplomacy, propaganda, covert action, the use of military force, and so forth) that politicians have at their disposal when attempting to influence the behaviour of other states. The foreign policy conception comes in both simple and more sophisticated versions.

In the simple version, the objective of economic sanctions is to change or prevent a target’s ‘objectionable’ policy or behaviour where a policy or behaviour is understood to be ‘objectionable’ if it conflicts with the foreign policy goals of the sender.

However, a frequent criticism of economic sanctions is that—if these are their goals—then economic sanctions don’t work. That is, they usually fail to change or prevent a target’s objectionable policy or behaviour (Nossal, 1989). This concern has led some to ask the question: if economic sanctions don’t work, why do we keep using them? The attempt to answer this question has led some theorists to develop more sophisticated conceptions of economic sanctions.

It has been argued, for instance, that although changing a target’s ‘objectionable’ policy or behaviour is sometimes the objective of economic sanctions, politicians often employ economic sanctions in much more nuanced and subtle ways (Baldwin, 1985, Cortright & Lopez, 2000).

First, Baldwin argues that economic sanctions are often employed with the more limited objective of influencing a target’s ‘beliefs, attitudes, opinions, expectations, emotions and/or propensities to act’ (Baldwin, 1985, 20). No immediate policy or behaviour change is expected, even if, in the long term, some change is hoped for. In such cases Baldwin argues that economic sanctions are being used symbolically to ‘send a message’. They can signal specific intentions or general foreign policy orientations or they can be used to show support or disapproval for the policies of other states. If the economic sanctions are imposed at some cost to the sending state then this demonstrates the sender’s commitment to its position and strengthens the message being sent. Importantly, even if the objective of an episode of economic sanctions is to ‘send a message’, it is unlikely to feature as the officially stated objective. The message is stronger if the sanctions are framed as demanding a change in the target’s objectionable policy or behaviour, even if it is clear that the economic sanctions alone cannot hope to change this behaviour.

Second, Baldwin argues that economic sanctions may have multiple objectives of which some will be more important to the sender than others. Behaviour change might be a sender’s secondary or even tertiary objective whilst ‘sending a message’ might be the primary objective. Even if the most important objective for the sender is to ‘send a message’, the economic sanctions must be framed as demanding behaviour change if this secondary or tertiary objective is to be met.

Third, economic sanctions may have multiple targets. For example, if economic sanctions are employed as a general deterrent, then there will be many targets of the influence attempt extending well beyond the original recipient of the economic sanctions (Baldwin, 1985).

David Cortright and George A. Lopez have also worked on developing more sophisticated understandings of economic sanctions. Economic sanctions, they argue, can be imposed for purposes that include deterrence, demonstrating resolve, upholding international norms and sending messages of disapproval as well as influencing behaviour change (Cortright & Lopez, 2000).

Finally, Kim Richard Nossal argues that senders might also have retributive punishment as their objective. In other words, the intent is to inflict economic harm on a target they regard as having wronged them, solely for its own sake and not to achieve any change in behaviour or policy. For Nossal, to be clear, saying a sender has been ‘wronged’ is not to say it has been morally wronged. It is only to say that the target’s actions have displeased the sender. Thus, on Nossal’s account, senders can ‘punish’ agents who, objectively, have done nothing morally wrong, just as a mafia boss might ‘punish’ underlings who have been passing information to the police. Again, it is important to realise that even if the purpose of the economic sanctions is retributive punishment, it is unlikely to be stated as such by the sender for fear of appearing irrational or vindictive (Nossal, 1989).

For all these reasons it would be a mistake to assume from the fact that economic sanctions often fail to achieve their stated objectives that economic sanctions do not work; stated objectives are not always true objectives. The true objectives might be to punish or to send a message. Even when the stated objectives are true objectives they may not be the primary objectives.

Given the above discussion, it appears that changing or preventing objectionable policies or behaviour, ‘sending a message’, and punishment are all possible objectives of economic sanctions.

ii. International Law Enforcement

Alternatively, economic sanctions are sometimes conceptualised as being a tool for enforcing international law or international norms of behaviour. On this conception, the ultimate objective of economic sanctions is understood to be international law enforcement.

For Margaret Doxey, enforcement of the law through the use of economic sanctions might take several forms.

First, enforcement might involve the ending of ongoing violations of international law/norms; the domestic analogy is that of stopping a crime in progress. Doxey’s own example is that of economic sanctions imposed to reverse the illegal invasion of the Falkland Islands by Argentina (Doxey, 1987, 91).

Second, enforcement might require preventing violations of international law from occurring in the first place. The domestic equivalent is that of preventing a known criminal conspiracy from being realised. As Doxey notes, under Chapter VII of the UN Charter, given adequate support from its members, the Security Council can designate any situation a threat to peace and then order preventive action to ensure that the threat is not realised (Doxey, 1987, 91).

Third, enforcement might require that economic sanctions are imposed punitively subsequent to violations of international law to deter either the recipient state or others from repeating the violations. Here economic sanctions are ‘a kind of fine for international misbehaviour’ (Doxey, 1987, 92).

The main difference between the law enforcement and the foreign policy conceptions of economic sanctions is that the former claims that the objectives of economic sanctions are purely to enforce international law/international norms of behaviour, whereas the latter claims that the objectives of economic sanctions are determined by a sender’s foreign policy. Of course the two conceptions are not mutually exclusive. A given sanctions episode may align with a sender’s foreign policy goals and work to enforce international law.

This difference between the two conceptions can partially be explained with reference to the focus of the respective theorists’ studies: those employing a foreign policy conception tend to focus on cases where states are the senders of economic sanctions, whereas those employing a law enforcement conception tend to focus on cases where the UN is the sender. Undoubtedly the foreign policy conception fits states better than the UN and the law enforcement conception fits the UN better than states. However, it would be wrong to say that the foreign policy conception applies to states and the law enforcement conception to the UN. States can also act to enforce international law. Likewise, the UN is not immune to the national interests of its more powerful member states.

To summarise then, these are the possible objectives of economic sanctions:

To change or prevent objectionable or unlawful policies or behaviour
To send a message with regards to objectionable or unlawful policies or behaviour
To punish objectionable or unlawful behaviour on deterrent or retributive grounds

c. Mechanisms

Whatever the objectives of economic sanctions, we also need to address the question of how economic sanctions work. Five mechanisms are discussed here: economic pressure, non-economic pressure, direct denial of resources, message sending and punitive mechanisms.

i. Economic Pressure

Theorists of economic sanctions began addressing the question of how economic sanctions worked in the 1970s and 80s and took as their model collective sanctions imposed on states—as this was the predominant mode of sanctioning at the time. They theorised that economic sanctions achieved behaviour/policy change via the imposition of economic pressure. Robert Pape sums this view up well when he states that economic sanctions ‘seek to lower the aggregate economic welfare of a target state by reducing international trade in order to coerce the target government to change its political behaviour’ (Pape, 1997, 94). In elaborating on this mechanism Pape argues that:

Targets of economic sanctions understand they would be better off economically if they conceded to the coercer’s demands, and make their decision based on whether they consider their political objectives to be worth the economic costs. (Pape, 1997, 94)

A similar view is held by Hufbauer et al., who use the following framework to analyse the utility of economic sanctions:

Stripped to the bare bones, the formula for a successful sanctions effort is simple: The costs of defiance borne by the target must be greater than its perceived cost of compliance. That is, the political and economic costs to the target from sanctions must be greater than the political and security costs of complying with the sender’s demands. (Hufbauer et al., 2007, 50)
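
Read as an inequality, the condition quoted above can be sketched as follows. The symbols are illustrative labels introduced here for convenience and are not notation used by Hufbauer et al.

```latex
% Illustrative notation only (not from the source):
%   C_def  = political and economic costs to the target of defying the sender
%            (that is, the costs imposed on it by the sanctions)
%   C_comp = the target's perceived political and security costs of complying
%            with the sender's demands
\[
  C_{\mathrm{def}} > C_{\mathrm{comp}}
\]
```

On this reading, a sanctions effort would be expected to fail whenever the target judges compliance to be at least as costly as continued defiance, however large the economic costs of the sanctions are in absolute terms.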

Indeed, the view that economic sanctions work via the imposition of economic pressure is the most widely accepted in the literature. Johan Galtung even calls it ‘the general theory of economic sanctions’, which he elucidates as follows. Focussing on collective economic sanctions, Galtung argues that the objective of economic sanctions is to cause an amount of economic harm sufficient to bring about the ‘political disintegration’ of the state which, in turn, will result in the state being forced to comply with the sender’s demands. For Galtung, ‘political disintegration’ is a split in the leadership of a state, or a split between the leadership and the people, that occurs as people within the state disagree about what to do with regard to the sanctions and the resulting economic crisis. This may involve popular protest and the government being forced to change the objectionable or unlawful policy for fear of losing power. Under what Galtung calls the ‘naïve theory’ of economic sanctions (which he rejects), the more severe the economic pressure, the faster and more significant the political disintegration and the sooner the state will comply. This theory is naïve, Galtung argues, because it does not take into account the fact that sanctions might, at least initially, result in political integration, as the people of the state pull together in the face of adversity. This is especially likely to occur if the target government can muster up the spirit of nationalism. Indeed, ‘rally-round-the-flag’ effects are often cited as a reason for the failure of economic sanctions. Under Galtung’s ‘revised theory’ of economic sanctions, economic pressure results initially in political integration but will eventually lead to political disintegration as the pressure increases. He warns, however, that the levels of economic harm required for this might in some cases be exceptionally severe (Galtung, 1967).

With regards to targeted sanctions, it seems possible that they could also sometimes operate via an economic pressure mechanism. For example, asset freezes on top government officials might pressure them into changing the objectionable or unlawful policy/behaviour if the amounts involved were significant enough.

ii. Non-Economic Pressure

Baldwin, however, argues that although economic pressure is one possibility for how economic sanctions might work, it is not the only one. In particular, he argues that economic sanctions do not have to cause economic harm to work. He argues that even if the economic sanctions make barely a dent in a target state’s economy, its government may be moved to act out of a concern to avoid international embarrassment or a reputation as a pariah state. This is particularly likely to occur when targets believe themselves to be members in good standing of international society. Suffering international condemnation might be unacceptable to them. In other cases Baldwin argues that targets might worry that the economic sanctions are a prelude to war. Since a just war must be a last resort, those about to resort to war often impose sanctions first—either in a genuine attempt to reach a non-military resolution or, more cynically, to demonstrate to domestic and international audiences that non-military methods have been attempted and failed—thus making war the last resort. A target might comply with the economic sanctions not because they damage the economy but out of concern to avoid war (Baldwin, 1985). The pressure employed here does not derive from the economic effects of the sanctions. Both collective and targeted economic sanctions may utilise a non-economic pressure mechanism.

iii. Direct Denial of Resources

Economic sanctions employing either the economic or non-economic pressure mechanisms work only indirectly: pressure is applied to targets to force them to change their objectionable/unlawful policies themselves. Thus such sanctions are sometimes referred to as ‘indirect’ sanctions (Gordon, 1999).

However, economic sanctions can also operate directly by denying a target the resources necessary for pursuit of its objectionable/unlawful policy. For example, if the objectionable/unlawful policy of the target state is its militarisation, then economic sanctions might be designed to damage the target state’s economy so thoroughly that it does not have the resources available to build up or maintain its military capacity, or they might involve arms embargoes or nuclear sanctions. Similarly, asset freezes of either state funds or the funds of government officials may operate with a direct mechanism. Freezing Libya’s state funds and the funds of Colonel Gadaffi was intended to make it impossible for him to pay mercenaries during the Arab Spring. Likewise, the freezing of assets suspected of belonging to terrorist groups is intended to make financing terrorist operations more difficult. Such ‘direct sanctions’ do not apply pressure to the target to change its objectionable/unlawful policy itself but instead work directly by denying the target the resources it needs to pursue that policy.

iv. Message Sending

Of course, not all economic sanctions aim to change or prevent an objectionable/unlawful policy. Some aim only to ‘send a message’. If the objective of the economic sanctions is simply to ‘send a message’ then the imposition of sanctions in itself should be sufficient to achieve this; causing economic harm should not be necessary. Having said this, there are undoubtedly ways of making the message stronger, and causing some economic harm to the target might do this. As both Baldwin and Doxey note, however, this is not the only way to strengthen the message. If the sanctions are costly to the sender (because, for instance, they involve putting a stop to valuable exports), the sender’s willingness to bear those costs shows how seriously it takes the situation.

v. Punitive Mechanisms

Punishment necessarily involves the infliction of some harm, suffering or otherwise unpleasant consequences on the target, and this is the case whether the objective of the punishment is to deter or whether the punishment is purely retributive in nature. Thus economic sanctions imposed as punishment must usually inflict some economic harm. Alternatively, if a target state (or organisation/individual) is particularly sensitive about its standing in the international community, symbolic sanctions expressing international condemnation might suffice as punishment.

d. Summary

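The table below summarises the possible objectives of economic sanctions, together with each objective’s related mechanism(s).

Objective: Related mechanism(s)
To change or prevent objectionable or unlawful policies or behaviour: economic pressure; non-economic pressure; direct denial of resources
To send a message with regards to objectionable or unlawful policies or behaviour: message sending
To punish objectionable or unlawful behaviour on deterrent or retributive grounds: punitive mechanisms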

2. The Ethics of Economic Sanctions

At least four moral frameworks have been used to consider the ethics of economic sanctions: just war theory, theories of law enforcement, utilitarianism, and ‘clean hands’.

a. Just War Theory

Of the few writers who have considered the ethics of economic sanctions, the majority point to the analogies between economic sanctions and war and use just war theory as a framework within which to assess their moral permissibility. Some extend the framework only to collective, comprehensive economic sanctions (Gordon, 1999) while others extend it to all types of economic sanctions (Pierce, 1996, Winkler, 1999, Amstutz, 2013).

Just war theory is split into two parts: jus ad bellum, which sets out the principles that must be followed for the resort to war to be just, and jus in bello, which sets out the principles that must be followed during war. (Some just war theorists add a third part, jus post bellum, which sets out the principles that must be followed post-war, but since no writers on economic sanctions consider jus post bellum, it has been left out of the following analysis.) Those writers who employ just war theory as a moral framework believe that its principles can, with minor adjustments, serve as a moral framework for economic sanctions as follows.

There are six principles of jus ad bellum. For the resort to war to be just, all six conditions must be met.

Just Cause: There must be a just cause for war. In mainstream just war theory, just cause is limited to:

the defence of a state from an actual or imminent military attack; and
humanitarian intervention in cases where a state is committing extremely serious human rights violations against its own citizens.

Theorists applying this principle to economic sanctions widely agree that there is just cause to impose economic sanctions if their aim is:

to defend a state from the target’s actual or imminent military attack; or
to stop extremely serious human rights violations being carried out by the target against its own citizens.

Some theorists go further and allow greater latitude for the case of economic sanctions, arguing that there is just cause for economic sanctions in situations of serious injustice that nevertheless fall short of just cause for war (Amstutz, 2013).

However, under the just war framework, there is no just cause for economic sanctions with punitive objectives. Likewise, there is no just cause for economic sanctions imposed preventively, to head off future (but non-imminent) attacks. The theorists in question do not consider economic sanctions designed to ‘send a message’, but since such sanctions do not aim to defend a state from military attack or to stop serious human rights violations but aim merely to change attitudes, beliefs, and so forth, it would seem that there would be no just cause for them on this approach. Therefore, economic sanctions designed to punish or to prevent objectionable/unlawful policies or behaviour would be ruled out as would all sanctions designed to ‘send a message’.

Proportionality: The harm that will foreseeably be caused by the war must not be disproportionate to the good that it is hoped will be achieved. The good consequences to be counted are limited to those specified in the just cause, i.e. putting a stop to any attack or human rights abuses. Any incidental good consequences, such as the kick-starting of an economy, should not be included in the proportionality calculation. However, the harmful consequences of war are not limited to certain types and should all be counted. Further, the calculation must include the harms suffered by all parties to the war and those suffered by neutral states.

For economic sanctions, this principle is met if the good achieved by the sanctions is expected to outweigh the harms of those sanctions. The good to be counted is the ending of the attack, human rights abuses or other injustice. The harms to be counted include not just those suffered by target citizens but also those suffered by sender citizens. It is worth remembering that citizens of sender states can suffer, either directly if their business relies on trade with the target, or indirectly if the economy of the sending state is particularly reliant on trade with the target.

There is nothing essential to the nature of economic sanctions that would prevent the proportionality condition being met.

Right Intention: The decision to go to war must be made with the right intention—the intention to achieve the just cause. The just cause must not be a pretext for some unjust end that is secretly intended. Therefore, economic sanctions must be imposed with the intention of defending a state from attack or stopping/reducing human rights violations. There is nothing essential to the nature of economic sanctions that prevents this condition from being fulfilled. However, Winkler warns that, as a matter of fact, there is a propensity for economic sanctions to be imposed without clear purpose and this means that the requirement of right intention might not be met in many actual cases (Winkler, 1999).

Legitimate Authority: The decision to go to war must be made by a legitimate authority, that is, one which has the moral right to act on behalf of its people and take them into a war. In international law there is a presumption that the governments of all states are legitimate authorities. According to mainstream just war theory, private individuals may not wage war. According to A. J. Coates, war is a legal instrument, and the power to enforce the law is vested in the government on behalf of the political community. Thus, private war is an instance of taking the law into your own hands and is a kind of vigilante justice (Coates, 1997).

There is nothing essential to the nature of economic sanctions that would prevent this condition being met. However, if we take the war/economic sanctions analogy seriously, the legitimate authority condition implies that private boycotts of a target state’s products by individuals, companies or other organisations are wrongful—a kind of vigilante justice. This is a conclusion that many would be unwilling to accept.

Last Resort: War must be the last resort. Given the horrendous harms it creates, war must be necessary in order to be just. If other, less harmful, alternatives are available such as economic sanctions or diplomatic measures, then war is not necessary and therefore not just. Under just war theory it is not the case that all the alternative measures must actually be attempted first: if it is obvious they would not work then there is no requirement to make such attempts.

Clearly, if war must be the last resort, it cannot be a requirement that economic sanctions are also a last resort. The equivalent requirement given is that economic sanctions must be the last resort short of war (Winkler, 1999, 145) or that less harmful or less coercive means must be attempted before economic sanctions may be imposed (Amstutz, 2013, 217). Again, there is nothing essential to the nature of economic sanctions that would prevent them being the least harmful or coercive means available. However, it is worth noting that the harmful effects of economic sanctions have been underestimated in the past and it is not inconceivable that the harms of economic sanctions could exceed those of war in a given case.

Reasonable Chance of Success: There must be a reasonable chance of success. This is to prevent hopeless wars where people die pointlessly.

This condition is particularly pertinent for economic sanctions. Historically, economic sanctions have been accused of ‘never working’ (Nossal, 1989). If this were true then economic sanctions would never be morally permissible under just war theory. However, it is not true. The most comprehensive study of the effectiveness of economic sanctions to date concluded that economic sanctions succeeded (achieved their stated objectives) in one third of cases (Hufbauer et al., 2007). This figure is disputed and is not in any case particularly high. However, it seems fair to say it is not impossible for economic sanctions to work. Therefore this condition could be met in specific cases.

Having addressed the principles of jus ad bellum, it is clear that some economic sanctions may meet the conditions. However, it is still necessary to consider jus in bello. As with jus ad bellum, all the conditions of jus in bello must be met for an individual military action to be morally permissible. However, there is only one principle that is particularly relevant to economic sanctions and that is the principle of discrimination.

Discrimination: The principle of discrimination requires attackers to distinguish between two classes of people in war: combatants and non-combatants, and stipulates their different treatment. According to the principle of discrimination, it is morally permissible to attack combatants at any time. Non-combatants, on the other hand, have immunity from attack, and it is never morally permissible to attack them directly. However, it is sometimes morally permissible to harm non-combatants as an unintentional side effect of an attack against combatants or military property under the doctrine of double effect. The doctrine of double effect acknowledges that one action (for example, bombing a weapons factory) can have two effects: the intended effect (destroying a weapons factory) and a foreseen but unintended side effect (killing non-combatants who live nearby). According to the traditional doctrine of double effect, it is morally permissible to bring about a harmful side effect if it is a foreseen but genuinely unintended consequence of pursuing some good end that is intended, so long as the harm of the side effect is not disproportionate to the intended good end. Michael Walzer, however, significantly revises the traditional doctrine of double effect and it is worth considering his revision here because most of those writing on economic sanctions use Walzer’s version. Walzer adds a further condition to the doctrine. It is not good enough, Walzer argues, that the harm to non-combatants be unintended and not disproportionate; we should expect soldiers to take positive steps to minimise harm to non-combatants, even if this imposes costs on themselves. As he puts it, ‘[d]ouble effect is defensible…only when the two [effects] are the product of a double intention: first, that the ‘good’ be achieved; second that the foreseeable evil be reduced as far as possible’ (Walzer, 2006, 155). It is only in this case that the side-effect harms to non-combatants are morally permissible.

In the case of economic sanctions though, who are the equivalent of ‘combatant’ and ‘non-combatant’? Pierce argues that the individuals falling into the class of ‘combatants’ are those who are actually part of the causal chain of events that led to the objectionable or unlawful policy: those who planned and organised it, and those who are carrying it out (Pierce, 1996, 102). Similarly, for Winkler, combatants are those who plan and carry out the objectionable or unlawful policy (Winkler, 1999, 149). For Amstutz, combatants are ‘the government and the elites that support it’ (Amstutz, 2013, 217). Gordon is not clear on who counts as a ‘combatant,’ but she is clear about who she thinks does not: ‘those who are least able to defend themselves, who present the least military threat, who have the least input into policy and military decisions, and who are the most vulnerable’ (Gordon, 1999, 125). On any of these definitions, it is clear that in cases where a target state is pursuing an objectionable/unlawful policy, there will be both ‘combatants’ and ‘non-combatants’ amongst its citizens.

It is generally agreed by writers employing the just war framework that collective sanctions violate the principle of discrimination. Where the collective sanctions involve an indirect economic pressure mechanism, economic harms are intentionally inflicted on the population in the hope that they will protest and force their government to change its objectionable policies. Given that some of the population will count as ‘non-combatants’, this involves the intentional infliction of harm on non-combatants and straightforwardly violates the principle of discrimination.

Where the collective sanctions involve a direct denial of resources mechanism, for example, an attempt to destroy an economy to end a state’s militarisation, the harm to non-combatants is not intended but it is foreseeable and it is still problematic. In the memorable words of Joy Gordon, such sanctions are like a ‘siege writ large’. The sanctions prevent the import of goods into a country just as a surrounding enemy army would a castle or city. Thus sanctions are vulnerable to the same moral criticisms as a siege. Sieges do not discriminate between combatants and non-combatants. In fact in a siege it is usually the non-combatants who suffer the most since increasingly scarce resources will be allocated as a matter of priority to the army or leadership. As Gordon states, in both sieges and in the case of comprehensive collective sanctions ‘the harm is done to those who are least able to defend themselves, who present the least military threat, who have the least input into policy or military decisions, and who are the most vulnerable’ (Gordon, 1999, 125). Sieges do not discriminate between combatants and non-combatants and they do not demonstrate an intention to minimise harms to non-combatants. Therefore, even if the harms are not intended, they cannot be justified under Walzer’s revised doctrine of double effect.

In summary, all writers employing the just war principles as a framework justify its use by drawing an analogy between economic sanctions and war. The just war framework then leads them to conclude that collective sanctions are always impermissible because they violate the just war principle of discrimination. Pierce, Winkler and Amstutz further extend the use of just war principles to targeted economic sanctions and conclude that targeted economic sanctions that do not harm ‘non-combatants’ may be morally permissible because it is at least theoretically possible that they can meet all the just war principles. This would appear to be a neat solution to the issue of the ethics of economic sanctions. However, there are objections to this approach.

i. Objections to the Use of Just War Theory: Christiansen and Powers

Christiansen & Powers argue that there are significant differences between the case of war and the case of collective, comprehensive economic sanctions and therefore that the just war principles provide an inadequate framework for the moral analysis of such economic sanctions. In particular they argue that the principle of discrimination does not apply to the case of economic sanctions.

For them, the most important differences between war and economic sanctions are that (1) economic sanctions are imposed as an alternative to war, not as a form of war (sieges during a war being a form of war), and (2) economic sanctions—if carefully designed and monitored—cause less harm than war. They argue that the just war principles—in particular the principle of discrimination—exist to prevent military conflicts heading down the road to ‘total war’, a hellish situation where anything goes. They are an attempt to keep war within some kind of limited civilised control. However, they argue, the intent behind economic sanctions is to avoid war altogether, to stop us even starting upon the road to total war. This being so, there is no reason why the principles governing war—including the principle of discrimination—should also govern economic sanctions (Christiansen & Powers, 1996, 101-109).

Of course that still leaves open the question of what principles should govern economic sanctions, particularly when it concerns questions of inflicting harm on ‘non-combatants’ or, as they put it ‘innocent’ people. Christiansen & Powers argue that in certain cases it is permissible to harm innocent people by means of economic sanctions—even intentionally—so long as their basic rights are not violated. As they state:

“Another model for thinking about sanctions may be found in the distinction between basic rights and lesser rights and enjoyments. This may prove more useful than the just war principle of [discrimination] as a paradigm for economic sanctions. As long as the survival of the population is not put at risk and its health is not severely impaired, aspects of daily life might temporarily be degraded for the sake of restoring the [more basic] rights of others” (Christiansen & Powers, 1996, 107).

Christiansen and Powers go on to argue that there are two further differences between war and economic sanctions that also lend support to abolishing the principle of discrimination. They argue (1) that a population might consent to suffer economic sanctions, in which case harming them would not violate their rights, and (2) that a population can in fact bear moral responsibility for the actions of its government, for example, by supporting or not opposing them, and so not qualify as ‘non-combatant’ or innocent. They argue that neither of these considerations is available in the case of war.

It is first worth pointing out that they are surely wrong about these considerations not being available in the case of war. A population suffering severe human rights violations such as ethnic cleansing or genocide might consent to military intervention to help protect them. Likewise, if we can hold a population morally responsible for the actions of their government because they supported them or did not oppose them, then we can do this whether economic sanctions or war are being considered. Nevertheless, their arguments that consent or moral responsibility on the part of the innocent population renders harm to that population morally permissible can be considered on their own merits. Let us consider each in turn.

If an individual genuinely consents to suffer harm then her rights are not violated since she has waived her right to not be harmed in this way. To give an example, it is often argued that the Black population of South Africa consented to the anti-Apartheid sanctions and that this justified the harms they suffered. The consent argument, of course, only applies where the innocent population does in fact consent. This is something that is very difficult to establish. Further, even if it can be shown that the majority of a population consent to the sanctions, it is unlikely that every last person will do so. Hence the consent justification is unlikely to justify all targeting of innocent people.

Christiansen & Powers further argue that we can consider a population morally responsible for its government’s policies if they support them or fail to oppose them, at least where the state in question is a democracy and opposition does not meet with serious penalties. In such cases, they argue, the population is not innocent and so it is morally permissible to target them directly with economic sanctions. They give the example of the White population of South Africa, arguing that the White population shared responsibility for the Apartheid policies of their government and therefore it was morally permissible to target them directly with economic sanctions. However, even if it is accepted that supporting or failing to oppose objectionable/unlawful policies renders one morally responsible and non-innocent, it is very unlikely that every last person in a state is actually supporting (or not opposing) the policies. There is almost always some opposition, however small. Further, one would not normally attribute moral responsibility for such actions to children. They remain innocent. Hence, even if we were to accept the idea that supporting (or even just failing to oppose) one’s government was sufficient for the attribution of moral responsibility, a state would still have some innocent members amongst its population.

Christiansen & Powers conclude by offering their own moral framework which, while clearly influenced by just war theory, has significant differences. The most significant difference is that the principle of discrimination is dropped and replaced by two principles, as follows:

A Commitment to and Prospects for a Political Solution: Sanctions should be pursued as an alternative to war, not as another form of war. They must be part of an abiding commitment to and a feasible strategy for finding a political solution to the problem that justified the imposition of sanctions in the first place.

Humanitarian Proviso: Civilians should be immune from grave and irreversible harm from sanctions, though lesser harms may be imposed on the civilian population. Provision must be made to ensure that fundamental human rights, such as the right to food, medicine, and shelter, are not violated. (Christiansen & Powers, 1996, 114)

ii. Further Objections to the Use of Just War Theory

It has been argued that the revisions made to the just war principles—considered above—do not go far enough. The just war principles are derived from a set of complex and detailed arguments all planted firmly within the context of war. These arguments contain premises that, whilst they may hold true in the case of war, do not always hold true in the case of economic sanctions. Therefore, a much more thoroughgoing revision of just war principles is required if they are to be applied to the case of economic sanctions (Ellis, 2013).

Further, while there are differences between war and collective comprehensive economic sanctions, there are even greater differences between war and targeted economic sanctions. These also call into question the use of a just war framework (Ellis, 2013). For example, why should an arms embargo—which aims to prevent or mitigate a war—be considered under the same principles governing the resort to war or the fighting of it? There is no obvious reason why it should.

b. Theories of Law Enforcement

As we have seen, one way of conceptualising economic sanctions is as a tool of international law enforcement: a means to prevent, terminate or punish violations of international law or international moral norms. Therefore, it would seem natural to analyse the ethics of economic sanctions using a framework based on the ethics of law enforcement. Theorists who have done this (Damrosch 1994, Lang 2008) argue that the use of economic sanctions as a tool of law enforcement faces significant moral challenges, as follows.

Legitimate Authority: Many argue that only a legitimate authority has the right to enforce the law. An authority is considered legitimate if she (or it) is morally justified in exercising that authority. Opinion is divided on what exactly makes an authority legitimate but two oft-cited necessary conditions are (1) the consent of those subject to the authority (either tacit or explicit), and (2) impartiality on the part of the authority; that is, the authority should have no reason to favour the interests of one party over the interests of any other (Rodin, 2002, 176-177).

In the domestic case, it is widely accepted that states (at least democratic states) have the legitimate authority to enforce domestic law against citizens. Therefore agents of the state (police, judges, prison officers) have the legitimate authority to prevent, terminate and punish crime in a way that ordinary citizens do not. If ordinary citizens attempt to prevent, terminate and punish crime themselves, without any state involvement, this is closer to vigilantism or revenge than law enforcement.

However, in the international case the picture is more complex. Although (at least democratic) states are regarded as having legitimate authority over their own citizens, they are not regarded as having legitimate authority over the citizens of foreign states or over foreign states themselves. First, they lack the consent of foreign citizens or states. Second, they lack impartiality since, in any international dispute, they are likely to prefer their own national interest over the interest of foreign states or citizens. This position on the legitimate authority of states is consistent with the fundamental principle of international law that all sovereign states are equal in the international system.

Different considerations apply when it comes to the United Nations. Is the United Nations a legitimate authority? The UN certainly does claim the authority to interpret international law and to enforce it, at least in the area of peace and security. According to the UN Charter, the Security Council has the authority to require that all UN member states impose economic sanctions on those states or individuals it deems a threat to peace and security. However, many would argue that this authority is illusory since the UN lacks the power to enforce its own judgments on matters of international law. This is because the UN relies on the support of member states to achieve law enforcement, and this is not always forthcoming. Further, the permanent members of the Security Council can veto any action the UN proposes. Other critics would argue that whatever de facto authority the UN has, that authority is not legitimate; some question whether the UN really has the consent of member states, others question whether or not the UN, dominated as it is by the five permanent members of the Security Council, is really impartial.

This leads many to conclude that (1) there is no entity in the international system with the legitimate authority to enforce the law, and (2) therefore there is no possibility of morally justified law enforcement at the international level.

Principled Basis: In order to be morally justified on the basis of law enforcement, the sanctions must be a response to violations of genuine international law or international moral norms (Damrosch, 1994). This is not as straightforward as it sounds. International law is a very different matter to domestic law; there is considerable dispute about the moral norms that hold sway internationally and whether or not they even count as real laws. While economic sanctions imposed as a response to violations of the rule against aggression or genocide would pass this test easily, other moral norms are more questionable; to borrow an example from Damrosch, is democratic governance an international moral norm?

Consistency: Law enforcement should be consistent: it is a fundamental principle of justice that like cases are treated alike. It is unfair if one state or individual is prevented from carrying out an activity or punished for it, when another is not (other things being equal). Yet, all our evidence to date shows that economic sanctions are not imposed consistently: they are not regularly and reliably imposed on those who violate international law or international moral norms. With regard to the UN, the national interests of the UN Security Council members are more a guide to the likelihood of sanctions being employed than the fact of a violation (Damrosch, 1994). The situation for states is no different. This should not be surprising: consistency in law enforcement is a product of impartiality and neither the UN nor states are impartial.

Harm to Innocents: Economic sanctions that are used to prevent, terminate or punish breaches of international law sometimes intentionally (or at least foreseeably) harm innocent people—those who bear no moral responsibility for the illegality in question. This is morally problematic because, as a matter of justice, we usually think that the harms of law enforcement and punishment should be directed only at wrongdoers (Lang, 2008; Damrosch, 1994).

Here though it is worth making a distinction between punishment after the fact and law enforcement directed at preventing or terminating violations of law.

In the case of punishment after the fact, it is straightforwardly accepted by most that it is wrong to punish the innocent. This means that collective sanctions—those aimed at the entire population of a state—are straightforwardly morally wrong if judged as punishment. They are a type of collective punishment that punishes the innocent along with the guilty. Targeted sanctions, of course, may be targeted directly at the guilty (or at least those believed to be guilty) and so can avoid this problem.

Lang would extend the prohibition on harming the innocent to all types of law enforcement. However, Damrosch argues that the case of preventing and terminating violations of law is different. She argues that if the law being enforced is important enough (for example, if the sanctions are aimed at preventing genocide) then innocents may be intentionally or foreseeably harmed to achieve this. To be sure, law enforcement measures should be chosen carefully to minimise the suffering of innocent bystanders, but such harm should not be ruled out altogether (Damrosch, 1994, 67).

c. Utilitarianism

Joy Gordon has used utilitarianism to assess the moral status of comprehensive economic sanctions (Gordon, 1999). According to utilitarianism, an act is right if and only if it maximises utility (i.e. the balance of pleasure over pain or, more generally, of benefit over harm).

According to Gordon, comprehensive economic sanctions are justified on utilitarian grounds in cases where ‘the economic hardship of the civilian population of the target country entails less human harm overall, and less harm to the sanctioned population, than the military aggression or human rights violations the sanctions seek to prevent’ (Gordon, 1999, 133). Let us consider this idea in a bit more detail.

Imagine a sender is indeed considering imposing economic sanctions on a state that is engaged in military aggression or human rights violations. According to utilitarianism, the sender would be permitted (indeed, required) to impose economic sanctions if the sanctions were expected to result in less harm overall than any other means of ending the aggression/human rights violations (travel bans, military intervention and so forth) or, indeed, “doing nothing” and letting the aggression/violations continue unchecked. Note that in making this utilitarian calculation, harms to sender citizens, target citizens and all other individuals affected are to be counted and weighed equally.

In order to determine whether economic sanctions are expected to result in the least harm in this case, we need to address two questions: (1) how harmful do we expect the economic sanctions to be? and (2) what is the probability they will succeed in ending the human rights abuses?

(1) It is fair to say that, in general, economic sanctions are less harmful and destructive in their effects than military attack but more harmful and destructive than diplomatic measures (such as travel bans or withdrawing staff from embassies). However, there will be exceptions. For example, a targeted military strike might result in a lot less harm than collective, comprehensive sanctions. It should not always be assumed that economic sanctions are less harmful than military action. Senders should also take care to consider the full range of economic sanctions available to them: targeted sanctions may cause much less harm than collective sanctions but be equally effective.

(2) We also need to consider whether the economic sanctions will be successful at ending the human rights abuses. It is important to take this into account. If economic sanctions do not work, then the target citizens continue to suffer the human rights abuses whilst also suffering the economic sanctions. It would have been better to not have imposed the sanctions at all. From a utilitarian point of view, it is wrong to impose economic sanctions if it is expected that they will fail or that they are very likely to fail. Since economic sanctions often have quite a low probability of success, they will often be ruled out on utilitarian grounds, at least in the case of more harmful comprehensive sanctions. Of course, this would need to be considered on a case-by-case basis. Gordon finds the ineffectiveness of economic sanctions particularly troubling, and claims it is unlikely any particular episode of comprehensive sanctions would be justified on utilitarian grounds (Gordon, 1999, 137).

Finally, senders also need to remember that economic sanctions—especially those using an economic pressure mechanism—often take years to work. Military intervention might be a faster way of ending the human rights abuses and consequently be the action that results in the least harm overall. In such a case, utilitarianism would demand military intervention, not economic sanctions.
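The two questions above can be combined into a rough expected-harm comparison. The sketch below is only an illustration of the structure of the utilitarian calculation, not a method proposed by Gordon or any other author discussed here. It assumes that harm can be expressed in a single arbitrary unit, that each option either ends the abuses or fails entirely, and that the time an option takes to work is ignored. The names Option, expected_harm and least_harmful, and every number used, are hypothetical; the one-third success probability simply echoes the figure reported by Hufbauer et al.

```python
from dataclasses import dataclass


@dataclass
class Option:
    """One course of action open to the sender (all figures hypothetical)."""
    name: str
    direct_harm: float  # harm the measure itself inflicts, in arbitrary units
    p_success: float    # estimated probability that it ends the abuses


def expected_harm(option: Option, harm_if_abuses_continue: float) -> float:
    """Direct harm plus the harm of continued abuses, weighted by the chance of failure."""
    return option.direct_harm + (1.0 - option.p_success) * harm_if_abuses_continue


def least_harmful(options: list[Option], harm_if_abuses_continue: float) -> Option:
    """Return the option with the lowest expected overall harm."""
    return min(options, key=lambda o: expected_harm(o, harm_if_abuses_continue))


# Illustrative numbers only; they are not drawn from any real sanctions episode.
HARM_IF_ABUSES_CONTINUE = 100.0
options = [
    Option("do nothing", direct_harm=0.0, p_success=0.0),
    Option("comprehensive sanctions", direct_harm=60.0, p_success=1 / 3),
    Option("targeted sanctions", direct_harm=10.0, p_success=0.25),
    Option("military intervention", direct_harm=80.0, p_success=0.75),
]

for o in options:
    print(f"{o.name}: {expected_harm(o, HARM_IF_ABUSES_CONTINUE):.1f}")
print("least harmful:", least_harmful(options, HARM_IF_ABUSES_CONTINUE).name)
```

With these made-up figures, comprehensive sanctions carry a higher expected harm than doing nothing, because their direct harm is added to the likely continuation of the abuses, while targeted sanctions come out lowest. Different inputs would reverse the ranking, which is why the assessment has to be made case by case.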

d. “Clean Hands”

Conventionally, economic sanctions are conceptualised as being measures designed to change the objectionable/unlawful behaviour of targets (or perhaps to punish it). However, Noam Zohar, drawing on Jewish theological tradition, argues in favour of an alternative way of thinking about economic sanctions—that of economic sanctions as a method of ‘preserving clean hands’.

Under a ‘clean hands’ sanctioning policy, the objective of the economic sanctions is not to change a target’s behaviour or to punish it but rather to avoid complicity in that behaviour. Zohar argues, for example, that if one state sells weapons (or allows weapons to be sold by its citizens) to a second state where it knows or suspects those weapons will be used to commit human rights violations, then it facilitates those violations and is thus morally responsible for them as an accomplice. Hence states have a duty to impose arms embargoes (a type of economic sanction) on targets that they suspect would use those arms to commit human rights violations. Furthermore, clean hands sanctions are not restricted to arms embargoes; Zohar argues that embargoes would be required on all goods which would facilitate wrongdoing. For example, he argues that there is a requirement to prevent oil exports to a state whose military is engaged in ethnic cleansing as oil would be necessary to fuel tanks, planes and so on (Zohar, 1993). Zohar’s analysis is restricted to cases where a state is violating the human rights of its own citizens. However, it can easily be extended to cover cases where states are engaged in other types of wrongdoing, for example, pursuing aggressive war.

Zohar’s idea is interesting because to date the moral analysis of economic sanctions has almost exclusively assumed that economic sanctions are a prima facie wrong and that their use requires moral justification. However, under a clean hands conception of economic sanctions the imposition of sanctions is, by contrast, a moral duty—a duty derived from the duty not to be complicit in human rights violations. Employing the clean hands conception of economic sanctions thus shifts the burden of moral justification from those who would impose sanctions to those who would not. The clean hands conception therefore appears to be a valuable tool for those who would impose economic sanctions in response to international wrongdoing. However, attractive as it may be, there are some difficulties with Zohar’s view (some of which he acknowledges himself).

The first relates to Zohar’s conception of complicity in wrongdoing. For Zohar, mere suspicion that the goods in question will be used for activities that violate human rights is sufficient to deem the exporting state complicit in the violations. This view of complicity is controversial. Many would argue that an accomplice to a crime must intend—or at least know—that the goods they are supplying will be used to commit a crime. To designate a person an accomplice on the grounds of mere suspicion, they argue, would appear to make one responsible for the crimes of other people, people over whom one has no control. If it cannot be said that the exporting state is complicit in cases of suspicion, then it cannot be said that it has a duty to sanction in these cases (at least not on the grounds that sanctioning would avoid complicity in wrongdoing). This view of complicity would restrict Zohar’s clean hands argument to cases where the exporting state intends or knows the goods supplied will be used in human rights violations.

Second, there is the question of which goods can be said to facilitate human rights violations. It seems obvious that weapons directly facilitate all kinds of human rights violations. But what about other goods? What about food for example? Without food, no military (or any other organisation) can operate. Does this mean that in cases where a state is engaged in human rights violations, there is a duty to sanction food exports? The clean hands argument would seem to suggest there is. For many, however, this conclusion would be too extreme.

Another serious problem relates to the question of dual-use goods. These are goods which have both military and civilian uses. To borrow Zohar’s example, oil may be used to fuel a campaign of ethnic cleansing but it may also be used to heat homes in winter. In cases of multilateral sanctions, such as those imposed by the UN, a ban on oil exports could cause civilians to freeze to death (as, in theory at least, no state would sell them oil). Should the UN sanction oil to avoid complicity in ethnic cleansing or should it continue to allow the export of oil to avoid civilians freezing to death? Zohar tentatively suggests that in such cases there may be a duty to engage in a limited military action designed to ensure oil exports are used purely by civilians. This would allow the exporting states to avoid complicity in the ethnic cleansing without causing civilians to freeze to death. He suggests this role could be taken on by the United Nations.

The problem with this suggestion is twofold. First, the limited military action suggested may simply not be possible. The importing state may simply take the oil by force from the UN. Second, even if limited military action were possible, a positive argument would still be required for this course of action. The fact that it resolves the dilemma is not by itself a positive argument in favour, given that other methods, such as full-scale military intervention, may also resolve the dilemma.

e. Summary

Economic sanctions raise serious moral questions that have largely been ignored by philosophers and political theorists. The existing literature on the ethics of economic sanctions, whilst important and illuminating, barely scratches the surface of the subject. Further research in this area is required. There is scope to consider the four frameworks outlined above in more detail and to critique their application and/or the conclusions reached under each of them. There is also scope to develop entirely new frameworks for the moral assessment of economic sanctions.

3. References and Further Reading

a. On the Nature of Economic Sanctions

Andreas, Peter, ‘Criminalizing Consequences of Sanctions: Embargo Busting and its Legacy’, International Studies Quarterly, 49, 2005
Baldwin, David, ‘The Sanctions Debate and the Logic of Choice’, International Security, 24, 1999/2000
Baldwin, David and Pape, Robert, ‘Evaluating Economic Sanctions’, International Security, 23, 1998
Baldwin, David, Economic Statecraft, (Princeton: Princeton University Press, 1985)
Cortright, David & Lopez, George A., Smart Sanctions: Targeting Economic Statecraft, (Lanham Md: Rowman & Littlefield, 2002)
Cortright, David & Lopez, George A., The Sanctions Decade: Assessing UN Strategies in the 1990s, (London: Lynne Rienner Publishers, Inc., 2000)
Crawford, Neta C. & Klotz, Audie, How Sanctions Work: Lessons from South Africa (Basingstoke: MacMillan Press Ltd, 1999)
Doxey, Margaret, International Sanctions in Contemporary Perspective (Basingstoke: MacMillan, 1987)
Elliott, Kimberly Ann, ‘The Sanctions Glass: Half Full or Completely Empty?’, International Security, Vol. 23, No.1, 1998
Galtung, Johan, ‘On the Effects of International Economic Sanctions: With Examples from the Case of Rhodesia’, World Politics, Vol. 19, Issue 3, 1967
Gordon, Joy, Invisible War, (Harvard University Press, 2010)
Hufbauer, Gary, Jeffrey Schott, and Kimberly Ann Elliott, Economic Sanctions Reconsidered, 3rd edition, (Washington: Peterson Institute for International Economics, 2007)
Pape, Robert A., ‘Why Economic Sanctions Do Not Work’, International Security, Vol. 22, No. 2, 1997
Pape, Robert, ‘Why Economic Sanctions Still Do Not Work’, International Security, Vol. 23, No. 1, 1998
Peksen, Dursun and Drury, Cooper A., ‘Coercive or Corrosive?: The Negative Impact of Economic Sanctions on Democracy’, International Interactions: Empirical and Theoretical Research in International Relations, 36, 2010
Peksen, Dursun and Drury, Cooper A., ‘Economic Sanctions and Political Repression: Assessing the Impact of Coercive Diplomacy on Political Freedoms’, Human Rights Review, 10, 2009
Wood, Reed M., ‘A Hand Upon the Throat of the Nation: Economic Sanctions and State Repression, 1976–2001’, International Studies Quarterly, 52, 2008

b. On the Ethics of Economic Sanctions

Amstutz, Mark, International Ethics: Concepts, Theories, and Cases in Global Politics, 4th edition, (Lanham: Rowman & Littlefield Publishers Inc), 2013, Chapter 10
Christiansen, Drew & Powers, Gerard F., ‘Economic Sanctions and Just War Doctrine’, in Cortright and Lopez (eds.), Economic Sanctions: Panacea or Peacebuilding? (Oxford: Westview Press, 1995)
Clawson, Patrick, ‘Sanctions as Punishment, Enforcement and Prelude to Further Action’, Ethics and International Affairs, 7, 1993
Damrosch, Lori Fisler, ‘The Collective Enforcement of International Norms through Economic Sanctions’, Ethics and International Affairs, 8, 1994
Ellis, Elizabeth, ‘The Ethics of Economic Sanctions’, PhD Thesis, University of Edinburgh, Edinburgh, 2013
Gordon, Joy, ‘Smart Sanctions Revisited’, Ethics and International Affairs, 25, 2011
Gordon, Joy, ‘A Peaceful, Silent, Deadly Remedy: The Ethics of Economic Sanctions’, Ethics and International Affairs, 13, 1999
Lang, Anthony F., Punishment, Justice and International Relations: Ethics and Order after the Cold War, (London: Routledge, 2008), Chapter 5
Nossal, Kim Richard, ‘International Sanctions as International Punishment’, International Organization, Vol. 43, No. 2, 1989
Pierce, Albert C., ‘Just War Principles and Economic Sanctions’, Ethics and International Affairs, 10, 1996
Winkler, Adam, ‘Just Sanctions’, Human Rights Quarterly, 21, 1999
Zohar, Noam, ‘Boycott, Crime and Sin: Ethical and Talmudic Responses to Injustice Abroad’, Ethics and International Affairs, Vol. 7, 1993

c. Other Referenced Works

Coates, A.J., The Ethics of War (Manchester: Manchester University Press, 1997)
Rodin, David, War and Self Defence, (Oxford: Oxford University Press, 2002)
Walzer, Michael, Just and Unjust Wars: A Moral Argument with Historical Illustrations, 4th edition (New York: Basic Books, 2006)

Author Information

Elizabeth Ellis
Email: E.A.Ellis@leeds.ac.uk
University of Leeds
United Kingdom

Presocratics

Presocratic philosophers are the Western thinkers preceding Socrates (c. 469-c. 399 B.C.E.) but including some thinkers who were roughly contemporary with Socrates, such as Protagoras (c. 490-c. 420 B.C.E.). The application of the term “philosophy” to the Presocratics is somewhat anachronistic, and their work is certainly different from what many people currently think of as philosophy. The Presocratics were interested in a wide variety of topics, especially in what we now think of as natural science rather than philosophy. These early thinkers often sought naturalistic explanations and causes for physical phenomena. For example, the earliest group of Presocratics, the Milesians, each proposed some material element (water, air, or the “boundless”) as the basic stuff either forming the foundation of, or constituting, everything in the cosmos.

Such an emphasis on physical explanations marked a break with more traditional ways of thinking that indicated the gods as primary causes. The Presocratics, in most cases, did not entirely abandon theistic or religious notions, but they characteristically posed challenges to traditional ways of thinking. Xenophanes of Colophon, for example, thought that most concepts of the gods were superficial, since they often amount to mere anthropomorphizing. Heraclitus understood sets of contraries, such as day-night, winter-summer, and war-peace to be gods (or God), while Protagoras claimed not to be able to know whether or not the gods exist. The foundation of Presocratic thought is the preference and esteem given to rational thought and argumentation over mythologizing. This movement towards rationality and argumentation would pave the way for the course of Western thought.

1. On “Presocratic” and the Sources
  a. The Sources
2. The Milesians
3. Xenophanes
4. Pythagoras and Pythagoreanism
5. Heraclitus
6. Eleatic Philosophy
7. Philosophies of Mixture
  a. Anaxagoras
  b. Empedocles
    i. Macrocosm
    ii. Microcosm
8. The Atomists
9. Diogenes of Apollonia
10. The Sophists and Anonymous Sophistic Texts
11. Conclusion
12. References and Further Reading
  a. Primary Sources
  b. Secondary Sources

1. On “Presocratic” and the Sources

Difficulties are perhaps inevitable any time we lump a group of variegated thinkers under one name. The so-called “Presocratic philosophers” were a group of different thinkers hailing from different places at different times, many of whom thought about different things. To call them all “Presocratic” thinkers can seem too sweepingly broad and inaccurate, or insensitive to the differences between each of the thinkers. Even, and perhaps especially, where there are similarities, “Presocratic” seems unsatisfactory. For, where the thought of different people deals with similar ideas, a specific name seems appropriate for that group of people. This happens in Presocratic philosophy (for example, the Milesians), but those specific names are treated merely as species of the larger genus that we call “Presocratic philosophy.”

There are also historical difficulties with the term. For example, the atomist Democritus, traditionally considered to be a Presocratic, is supposed to have been approximately contemporary with Socrates. Continued use of the term, then, should be tentative and careful. Whatever the case, these thinkers set Western philosophy on its path.

a. The Sources

We have no complete writings from any of the Presocratics, and from some, nothing at all. Our sources, then, are primarily twofold: fragments and testimonia. The fragments are purported bits of the thinkers’ actual words. These might be fragments of books that they wrote, or simply recorded sayings. In any case, there are no surviving complete works from the Presocratics. Moreover, it is important to remember that there are no original compositions—of any length or degree of completeness—available. Neither, for that matter, are any originals available from Plato or Aristotle. In the pre-printing press days, scribes copied whatever editions of books and other written works they had available to them. We have texts that have been copied many times over. This means that, even with the fragments, we can never be sure whether or not the words we are reading correspond exactly to the original ideas that the Presocratics expressed.

The ancient testimonies come to us from several sources, each having its own agenda and degree of reliability. Both Plato and Aristotle explicitly name many of the Presocratics, sometimes discussing their supposed ideas at length. We must recognize that both Plato and Aristotle almost certainly treated Presocratic thought in light of their own respective philosophical agendas. Therefore, the information we get from them about the Presocratics is likely skewed and sometimes arrantly false. Plato wrote philosophical-literary dialogues, and likely needed to represent the Presocratics in his own peculiar ways to meet the needs of the dialogues. Aristotle, who wrote in the treatise style to which we are more accustomed today, also references the Presocratics in the context of his own philosophy. Aristotle would set out to write on a particular topic (for example, physics), and would survey the ideas of his predecessors on that same topic. In doing so, he at times agreed with their positions, and often disagreed with them. We have to beware, especially where Aristotle disagreed with his predecessors, of a possible (and possibly intentional) straw-man technique that Aristotle might have employed to advance his own position. Thus, while the accounts of Plato and Aristotle can be useful, we should read them cautiously.

2. The Milesians

While it might be inaccurate to call them a school of thinkers, the Milesian philosophers do have connections that are not merely geographical. Hailing from Miletus in Ionia (modern-day Turkey), Thales, Anaximander, and Anaximenes each broke with the poetic and mythological tradition handed down by Hesiod and Homer. With what little we know about the Milesians, we do not consider them philosophers in the same way that we consider Plato, Aristotle, and their successors philosophers. Much of what we know about them suggests that they were protoscientists, concerned with cosmogony, the generation of the cosmos, and cosmology, the study of or inquiry into the nature of the cosmos. Their cosmogonies and cosmologies are oriented primarily by naturalistic explanations, descriptions, and conjectures, rather than traditional mythology. In other words, the Milesians ostensibly sought to explain the cosmos on its own terms, rather than pointing to the gods as the causes or progenitors of all natural phenomena.

The geographical placement of Miletus is noteworthy. It is not unlikely that someone like Thales, for example, travelled to Egypt and perhaps to Babylon. Indeed, there is considerable evidence to suggest that the Babylonians, in some fashion or another, contributed significantly to ancient Greek knowledge of astronomy and mathematics. This is important to keep in mind when considering Presocratic discoveries in astronomy, mathematics, and other fields. There is scant evidence to suggest that this or that Presocratic thinker was the sole inventor or discoverer of any particular scientific finding or field.

a. Thales

Typically considered to be the first philosopher in the history of Western philosophy, Thales (c. 624-c. 545 B.C.E.) is a figure surrounded by legend and anecdotes. The historian Herodotus says that Thales proposed a single congress for Ionia, effectively centralizing the governmental powers, and making Ionia a single state (Graham 23). In a Lydian military campaign, he is supposed to have diverted the Halys river so that the Lydian military could safely cross in the absence of bridges (Graham 25). Aristotle relays another story, claiming to show us how Thales defended himself and philosophers against the claim that philosophers are useless. Through astronomy, Thales was purportedly able to predict a good olive harvest for a particular year. That winter, he bid on the region’s olive presses, and since no one bid against him (they apparently found his prediction incredible), he put down only a small sum. “When harvest time came and everyone needed the presses right away, he charged whatever he wished and made a good deal of money, thus demonstrating that it is easy for philosophers to get rich if they wish, but that is not what they care about” (Graham 25). Plato relates the humorous story that Thales fell into a well while stargazing. “A Thracian servant girl with a sense of humor…made fun of him for being so eager to find out what was in the sky that he was not aware of what was in front of him right at his feet” (Graham 25). Taken together, these might be the first anecdotes of the seemingly impractical philosopher who proves himself practically competent but ultimately unconcerned with worldly affairs.

While we have no way of knowing whether or not any of these stories square with the facts, they paint a picture of Thales as a practical and theoretical wise man, a picture that attracted the eyes of most ancient authorities. He is said to have predicted a solar eclipse in 585 B.C.E., helping the Ionians in battle, since he informed them of the coming darkness, and the enemy was, literally, left in the dark (Graham 23). It is also reported that Thales was highly influential in his work in geometry, if not entirely responsible for introducing it to Greece from Egypt. Indeed, he is supposed to have discovered that two triangles with one equal side and equal adjacent angles are congruent (Graham 35), that a circle is bisected by its diameter (Graham 33), and that the angles at the base of an isosceles triangle are equal (Graham 35).

Perhaps because of Thales, Milesian philosophy has running through it a taste for the first principles or beginnings of the cosmos. Thales supposed the principle or source (arche) of all things to be water. Aristotle guesses some reasons why Thales might have believed this (Graham 29). First, all things seem to derive nourishment from moisture. Next, heat seems to come from or carry with it some sort of moisture. Finally, the seeds of all things have a moist nature, and water is the source of growth for many moist and living things. Some assert that Thales held water to be a component of all things, but there is no evidence in the testimony for this interpretation. It is much more likely, rather, that Thales held water to be a primal source for all things—perhaps the sine qua non of the world.

It is unclear just how far we are to take Thales here, or precisely how, or if, water plays a role in every cosmological phenomenon. While Thales did turn to naturalistic explanations of the cosmos, he did not abandon belief in the gods. He was supposed to have thought that “all things are full of gods,” and that water is pervaded by a divine power, which also moves the water (Graham 35). If all things either are water, or can ultimately be traced in some way to water, water itself becomes divine: it is the life of the universe, and thus all things are in some way divine. Moreover, if water is more or less connected with some particular thing in the cosmos, then it would stand to reason that some things are more or less divine. As Aetius testifies, “Thales said that God is the mind of the world, and the totality is at once animate and full of deities. And a divine power pervades the elemental moisture and moves it” (Graham 35). Thales, then, did not abandon theology in favor of naturalism, but rather radically modified it.

b. Anaximander

Anaximander (c. 610-c. 545 B.C.E.) followed in Thales’ footsteps (he might have been Thales’ student) by applying his astronomical knowledge to practical life on earth. He was supposed to have invented the gnomon, a simple sundial (Graham 49). He may have introduced the knowledge of the solstices and equinoxes to the Greeks, as well as the twelve-hour division of the day, knowledge he probably gained from the Babylonians (Graham 49). He travelled extensively, gaining first-hand geographical knowledge. Indeed, he was supposed to have drawn a map of the earth as he knew it (Graham 49).

Like Thales, Anaximander also posited a source for the cosmos, which he called the boundless (apeiron). That he did not, like Thales, choose a typical element (earth, air, water, or fire) shows that his thinking had moved beyond the more readily evident sources of being. He might have thought that, since the other elements seem more or less to change into one another, there must be some source beyond all these, a kind of background upon which, or source from which, all these changes happen. Indeed, this everlasting principle gave rise to the cosmos by generating hot and cold, each of which “separated off” from the boundless. How it is that this separation took place is unclear, but we might presume that it happened via the natural force of the boundless. The universe, though, is a continual play of elements separating and combining. In poetic fashion, Anaximander says that the boundless is the source of beings, and that into which they perish, “according to what must be: for they give recompense and pay restitution to each other for their injustice according to the ordering of time” (F1).

In the generation of the cosmos as we know it now, human beings came to be from other animals. While it would be inaccurate to call Anaximander the father of the theory of evolution, the history of that theory should at least make mention of his name. Anaximander thought that human beings could not have been at their origin the way that they are now. That is, they must have arisen from some other animals, since human beings need longer stretches of time for nurture than other animals. They could not have survived, he reasons, without the generative help of other animals (Graham 57). He thought that human beings arose from or were at least akin to fish (Graham 59). Beyond this, humans seem to have needed moisture and heat for their generation. More specifically, humans originated with moisture in some sort of shell, and eventually matured, moved onto land, “and survived in a different form for a short while” (Graham 63). What evidence Anaximander might have had to support these claims we can only guess, but his willingness to explain the world on its own terms, without recourse to divine generation or intervention (although he might well have considered the boundless to be divine), is the mark of a new way of thinking.

c. Anaximenes

If our dates are approximately correct, Anaximenes (c. 546-c. 528/5 B.C.E.) could have had no direct philosophical contact with Anaximander. However, the conceptual link between them is undeniable. Like Anaximander, Anaximenes thought that there was something boundless that underlies all other things. Unlike Anaximander, Anaximenes made this boundless thing something definite: air. For Anaximander, hot and cold separated off from the boundless, and these generated other natural phenomena (Graham 79). For Anaximenes, air itself becomes other natural phenomena through condensation and rarefaction. Rarefied air becomes fire. When it is condensed, it becomes water, and when it is condensed further, it becomes earth and other earthy things, like stones (Graham 79). This then gives rise to all other life forms. Furthermore, air itself is divine. Both Cicero and Aetius report that, for Anaximenes, air is God (Graham 87). Air, then, changes into the basic elements, and from these we get all other natural phenomena. This means that ostensibly qualitative properties of things, for example, hot-cold, hard-soft, and so forth, are reducible to quantitative properties (McKirahan 51). Since air is boundless, it does not have a beginning or end, but is in a constant state of flux. Air is the morphological thread binding all things together.

A rather convenient psychological takeaway from Anaximenes’ theory is that the soul (psychê), traditionally considered to be breath, is itself airy (Graham 87). So, the individual human soul is in some way divine since each human being partakes of air. Again, it is remarkable that Anaximenes, like his fellow Milesians, did not have recourse to Homeric or Hesiodic mythology to explain the world. The Milesians arguably stand at the beginning, at least as the testimony and scant textual evidence have it, of a distinct way of thinking that we consider to be scientific, however primitive it may be. Despite this inclination toward naturalistic explanations of the world, they considered the gods to be thoroughly infused in their world. With the Milesians comes a radical shift in thought. The radical nature of their thinking does not depend upon a rejection of all divinity, but upon a reformation in the way we think about it. This leads us to Xenophanes, who first explicitly formulated a critique of traditional ways of thinking about divinity.

3. Xenophanes

Xenophanes (c. 570-c. 478 B.C.E.) was from Colophon, north of Miletus in Ionia. He did not remain in Colophon, but travelled around Greece reciting his poetry, finally settling in modern-day Sicily. Since his views were expressed poetically, it is at times difficult to know how to interpret them. Thus, we should keep in mind that, while we have more fragmentary material from Xenophanes than all of the Milesians taken together, the way in which his views were expressed, and the fragmentary nature of our sources, prevent us from being certain about what exactly he meant. What exposure he might have had to Milesian thought we do not know. Like the Milesians, however, he challenged traditional theological views, but in a new way. Even his social views seem to have been at odds with the ancient Greek sensibilities. For example, he rejects the glorification and honorific status of athletes, saying that wisdom should be preferred (F2).

Unlike the Milesians (or the evidence we have of them), Xenophanes directly and explicitly challenged Homeric and Hesiodic mythology. “It is good,” says Xenophanes, “to hold the gods in high esteem,” rather than portraying them in “raging battles, which are worthless” (F2). More explicitly, “Homer and Hesiod have attributed to the gods all things that are blameworthy and disgraceful for human beings: stealing, committing adultery, deceiving each other” (F17). At the root of this poor depiction of the gods is the human tendency towards anthropomorphizing the gods. “But mortals think gods are begotten, and have the clothing, voice and body of mortals” (F19), despite the fact that God is unlike mortals in body and thought. Indeed, Xenophanes famously proclaims that if other animals (cattle, lions, and so forth) were able to draw the gods, they would depict the gods with bodies like their own (F20). Beyond this, all things come to be from earth (F27), not the gods, although it is unclear whence the earth came. The reasoning seems to be that God transcends all of our efforts to make him like us. If everyone paints different pictures of divinity, and many people do, then it is unlikely that God fits into any of those frames. So, holding “the gods in high esteem” at least entails something negative, that is, that we take care not to portray them as superhumans.

We have seen what the gods are not, but what is God or the gods? It is unclear whether or not Xenophanes was a theological monist or pluralist, but he seems at least to hint at either one God only, or one God above all others. “One God, greatest among gods and men…” (F23) could mean that there is one God only, despite the fact that mortals talk about a plurality of gods, or that there is one God who is greater than all the rest. This God, in his entirety, sees, thinks, hears, and shakes all things by the thought of his mind (F24-F25). He remains, unmoving, in the same place (F26). If God is in some place, does this not mean that he is embodied? This is unclear, but Aristotle claims that Xenophanes thought of God as spherical, presumably based upon the picture of uniformity portrayed in the preceding fragments (Graham 113). We might also wonder whether or not this depiction of God, too, is in some way anthropomorphizing. How do we know that God has a mind, or that he hears, sees, and thinks? Xenophanes does not present us with answers to these questions. Whatever the case, Xenophanes’ God is unlike any previous conceptions of divinity, and seems to have set in motion a long tradition of critical and rational theology.

Ultimately, we can never know the full and simple truth about the gods or anything else. Even if we successfully describe events in our world, we cannot claim knowledge about such things; for, “opinion is wrought over all” (F35). This, however, apparently does not prevent us, through an effort of seeking, from understanding things better. If Xenophanes is a skeptic, therefore, his skepticism is pliable and open-ended. While rejecting dogma, Xenophanes remains willing to make rational conjectures about God.

4. Pythagoras and Pythagoreanism

Pythagorean influence left a strong presence and legacy in ancient thought, and yet little is known with certainty about Pythagoras of Samos (c. 570-c. 490 B.C.E.). A great deal of legend surrounds the life of Pythagoras. Scholars generally agree that Pythagoras left Samos for Croton, where he enjoyed political esteem as a ruler. His philosophical legacy, however, was not his political success, but instead the almost religious following that developed in his name (perhaps because of his political success). He developed a following that continued long past his death, on down to Philolaus of Croton (c. 470-c. 399 B.C.E.), a Pythagorean from whom we may gain some insight into Pythagoreanism. Whether or not the Pythagoreans followed a particular doctrine is up for debate, but it is clear that, with Pythagoras and the Pythagoreans, a new way of thinking was born in ancient philosophy, and had a significant impact on Platonic thought.

Many know Pythagoras for his eponymous theorem: the square of the hypotenuse of a right triangle is equal to the sum of the squares of the other two sides. Whether Pythagoras himself invented the theorem, or whether he or someone else brought it back from Egypt, is unknown. He was accorded almost godlike status among his followers, some saying that there are three classes of rational beings: the gods, human beings, and beings like Pythagoras (Graham 921). He was said to have a golden thigh, to have been hailed by name by the river Cosas, and to have been seen simultaneously in both Metapontum and Croton (Graham 919). Empedocles sang his praises by saying that Pythagoras could, by the power of his mind, behold all things “for ten or even twenty generations of men” (Graham 917).

One doctrine that scholars confidently attribute to Pythagoras and his followers is the transmigration of souls. The soul, for Pythagoras, finds its immortality by cycling through all living beings in a 3,000-year cycle, until it returns to a human being (Graham 915). Indeed, Xenophanes tells the story of Pythagoras walking by a puppy who was being beaten. Pythagoras cried out that the beating should cease, because he recognized the soul of a friend in the puppy’s howl (Graham 919). Another Pythagorean view seems not to have restricted a life cycle to souls, but widened the scope to all things, such that there is nothing completely new, since everything has happened before and will happen again (Graham 919). What exactly the Pythagorean psychology entails for a Pythagorean lifestyle is unclear, but we pause to consider some of the typical characteristics reported of and by Pythagoreans.

Pythagoreans were famous for their silence (Graham 911). Their teachings were transmitted cryptically, and it is unclear how strict a doctrine the followers were required to observe. Some are reported to have refrained from eating or handling beans, either because they resemble genitals or because they resemble the gates of Hades. Some were commanded not to sacrifice a white rooster, since white symbolized purity and goodness, and because roosters are sacred to Men, and thus roosters announce the sunrise in the morning (Graham 923). There were also the akousmata (things heard), which were expressed in three categories: what something is, what the most x is (for example, what is the wisest?), and what one should or should not do (for example, abstention from beans or from sacrificing white cocks). The Oracle at Delphi was said to be the tetractys and, therefore, harmony, which satisfies the first set of akousmata. Number is said to be the wisest, with giving names to things coming in second for wisdom (Graham 923).

Plato and Aristotle tended to associate the holiness and wisdom of number—and along with this, harmony and music—with the Pythagoreans (Graham 499). For example, the decad was sacred. The tetractys shows us the holiness of the number ten.

The tetractys is a triangular figure of ten points arranged in rows of one, two, three, and four. In it, we can see a relationship among numbers, all of which leads us to a figure. There is the one, which begets plurality (two). When we add three and four to these, there is the sum of ten, which signifies the composition of the cosmos (Graham 499). There were nine visible heavenly bodies, and so the Pythagoreans posited a tenth body, counter-earth, to balance out the cosmos. The tetractys also gives us the ratios of harmony: 1:2, 2:3, and 3:4, or the octave, the fifth, and the fourth, respectively (McKirahan 92). The universe is harmony, and Philolaus considered the soul also to be a harmony (Graham 505). Thus, at least for Philolaus, the soul could be considered to be a type of microcosm.
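
The arithmetic of the tetractys can be checked with a small sketch. The Python below is only an illustration and is not drawn from any Pythagorean source: the function names (tetractys_figure, total_points, harmonic_ratios) are hypothetical, and the figure is assumed to be the usual triangular arrangement of ten points in rows of one, two, three, and four.

```python
from fractions import Fraction

# Assumption: the tetractys is laid out as rows of 1, 2, 3, and 4 points.
ROWS = [1, 2, 3, 4]


def tetractys_figure(rows=ROWS):
    """Return the triangular figure as a list of centered text lines."""
    width = 2 * max(rows) - 1  # widest row: points separated by single spaces
    return [" ".join("*" * n).center(width) for n in rows]


def total_points(rows=ROWS):
    """Sum of the rows: 1 + 2 + 3 + 4."""
    return sum(rows)


def harmonic_ratios(rows=ROWS):
    """Ratios of successive rows, labelled with the intervals named in the text."""
    names = ["octave", "fifth", "fourth"]
    return {name: Fraction(a, b) for name, a, b in zip(names, rows, rows[1:])}


if __name__ == "__main__":
    print("\n".join(tetractys_figure()))
    print("points:", total_points())
    for name, ratio in harmonic_ratios().items():
        print(f"{name}: {ratio.numerator}:{ratio.denominator}")
```

Run as a script, this prints the triangular figure, the total of ten points, and the three ratios of successive rows (1:2, 2:3, 3:4) paired with the octave, the fifth, and the fourth.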

Perhaps more basic than number, at least for Philolaus, are the concepts of the limited and unlimited. Nothing in the cosmos can be without limit (F1), including knowledge (F4). Imagine if nothing were limited, but matter were just an enormous heap or morass. Next, suppose that you are somehow able to gain a perspective of this morass (to do so, there must be some limit that gives you that perspective!). Presumably, nothing at all could be known, at least not with any degree of precision, the most careful observation notwithstanding. Additionally, all known things have number, and number is classed in two kinds: odd and even (F6). Number, too, can be seen here as a kind of limiter. Each thing is one, and thus separate from other things.

There is evidence to suggest that some Pythagoreans gave credence to a list of opposites in addition to limit-unlimited and odd-even: one-plurality, right-left, male-female, rest-motion, straight-bent, light-dark, good-evil, square-oblong. The first term of each of these pairs would be organized in one column, while the second term would be organized in a parallel column. Although it is unclear how, these columns of opposites somehow give us insight into the basic stuff of the cosmos and of being. Notice also that there are ten pairs of opposites. Limit-unlimited and odd-even are listed first, and these give rise to the rest of the cosmos (McKirahan 97). Thus, the Pythagoreans saw a universe whose nature is numerical, but also one held in the tension of harmony and, as in Heraclitus, the tension of opposites.

5. Heraclitus

Just south of Colophon in Ionia was Ephesus, where yet more new philosophical blood was circulating. Heraclitus (c. 540-c. 480 B.C.E.) stands out in ancient Greek philosophy not only with respect to his ideas, but also with respect to how those ideas were expressed. His aphoristic style is rife with wordplay and conceptual ambiguities. Heraclitus was getting at what he saw as a reality composed of contraries—a reality, too, whose continual process of change is precisely what keeps it at rest. Such a unique style of thought and expression seems to have sprung forth from a life just as unique, and perhaps even contrarian. While we often do well to proceed cautiously with Diogenes Laertius’ accounts of the philosophers, his account of Heraclitus is telling, and fits with Heraclitus’ sometimes scathing thought. Diogenes Laertius calls him “conceited” and “haughty,” citing as evidence Heraclitus’ denunciation of Hesiod, Pythagoras, Xenophanes, and Hecataeus as people who have learned much (literally, polymaths), but understand little. Diogenes Laertius says that Heraclitus “studied with no one, but asserted he inquired of himself and learned everything by himself” (Graham 139). Indeed, when reading Heraclitus, one can easily imagine a loner whose originality of thought was closely linked with, if not born from, that solitude.

He is often critical of the ignorance—that is, the lack of genuine understanding—of the majority of human beings. He speaks of a logos (translatable as “word,” “reason,” “rationality,” “language,” “ratio,” and so forth) that most human beings do not understand, neither before nor after they hear it. Many people are asleep, despite being awake. “Having heard without comprehension, they are like the deaf; this saying bears witness to them: present they are absent” (F6). Pronouncing a sentiment further echoed in Plato and Aristotle, Heraclitus says, “the many are base, while the few are noble” (F12). Most people do not observe the world carefully enough, and few attain a true understanding of it. There is in Heraclitus a distinction between having much information under one’s belt, and understanding how all of it fits together, what it all means, that is, its overall significance.

One might wonder whether or not God, for Heraclitus, is synonymous with reality, so that a real understanding of the universe is an understanding of what is sacred. God is “day night, winter summer, war peace, satiety hunger…” (F103). Fire plays a significant role in his picture of the cosmos. No God or man created the cosmos, but it always was, is, and will be fire. At times it seems as though fire, for Heraclitus, is a primary element from which all things come and to which they return. At others, his comments on fire could easily be seen metaphorically. What is fire? It is at once “need and satiety.” This back and forth, or better yet, this tension and distension is characteristic of life and reality—a reality that cannot function without contraries, such as war and strife. “A road up and down is one and the same” (F38). Whether one travels up the road or down it, the road is the same road. “On those stepping into rivers staying the same other and other waters flow” (F39). In his Cratylus, Plato quotes Heraclitus, via the mouthpiece of Cratylus, as saying that “you could not step twice into the same river,” comparing this to the way everything in life is in constant flux (Graham 158). This, according to Aristotle, supposedly drove Cratylus to the extreme of never saying anything for fear that the words would attempt to freeze a reality that is always fluid, and so, Cratylus merely pointed (Graham 183). Whether or not this is a fair interpretation of Heraclitus, we can see that change plays a central role in his thought. Yet, Heraclitus recognizes that “changing it rests” (F52). So, the cosmos and all things that make it up are what they are through the tension and distention of time and becoming. The river is what it is by being what it is not. Fire, or the ever burning cosmos, is at war with itself, and yet at peace—it is constantly wanting fuel to keep burning, and yet it burns and is satisfied.

6. Eleatic Philosophy

Three important thinkers fall under the category of Eleatic thought: Parmenides, Zeno, and Melissus. The latter was not from Elea as the former two were, but his thought directly inherits the monism typical of Parmenides and Zeno. Thus, Melissus will be treated in this section after Parmenides and Zeno.

a. Parmenides

If it is true that for Heraclitus life thrives and even finds stillness in its continuous movement and change, then for Parmenides (c. 515-c. 450 B.C.E.) life is at a standstill. Hailing from Elea (a Greek colony in modern-day Italy), and the father of Eleatic philosophy, Parmenides was a pivotal figure in Presocratic thought, and one of the most influential of the Presocratics in determining the course of Western philosophy. According to McKirahan, Parmenides is the inventor of metaphysics (157), the inquiry into the nature of being or reality. While the tenets of his thought have their home in poetry, they are expressed with the force of logic. The Parmenidean logic of being thus sparked a long lineage of inquiry into the nature of being and thinking.

Parmenides’ poem moves in three parts: a sort of foreword (proemium), a section on Truth, and a section on Opinion (the way of mortals). The narrator of the poem describes allegorically a journey in a chariot, led speedily along by mares, but guided by maidens from the House of Night. He was led to the threshold of the paths of Night and Day, where Justice holds the keys that open the door to each. The maidens persuaded Justice, with gentle words, to open the door between Night and Day, whereupon the travellers were greeted by a goddess, who claims to teach the only paths for thought: “the one: that it is and that it is not possible not to be, is the path of Persuasion (for she attends on Truth); the other: that it is not and that it is right it should not be, this I declare to you is an utterly inscrutable track, for neither could you know what is not (for it cannot be accomplished), nor could you declare it” (F2). The “inscrutable” track is the path of mortals (Opinion), while the former is the path of Truth. Curiously, the goddess urges the sojourner to learn both, claiming that “it is right for you to learn all things.” The goddess suggests that, although the path of Opinion is ultimately wrong-headed, it is nevertheless wise to understand why such a path is one to which many so often cling.

i. The Path of Being

The first path is the path of being. The Greek word esti(n) is the third person singular of the verb to be. It need not express a subject, and does not in Parmenides’ poem. We therefore import the English word “it” into the translation for smooth English. There is much debate about the way Parmenides uses to be in his poem, but the possibilities are these. First, he might have used esti in an existential sense, that is, that something simply exists (for example, “Spot exists”). Second, he might have meant esti in the predicative sense, for example, “the t-shirt is red.” Third, esti could take a sense of identity, as in, “A=B.” Fourth is the veridical sense, or, “it is true that X.” Finally, there could be some combination of some or all of these senses of esti (Sedley 114-115 and McKirahan 160-163). Whatever the case, Parmenides does seem to have in mind the whole, that is, all of being. As soon as we differentiate among types of beings, we have entered into the way of Opinion or plurality.

The right way of thinking is to think of what-is, and the wrong way is to think both what-is and what-is-not. The latter is wrong, and the goddess forbids it, simply because non-being is not. In other words, there is no non-being, so properly speaking, it cannot be thought—there is nothing there to think. We can think only what is and, presumably, since thinking is a type of being, “thinking and being are the same” (F3). It is only our long entrenched habits of sensation that mislead us into thinking down the wrong path. We are, as it were, “two-headed” and helpless in our ignorant journey down the path of Opinion, and we mistakenly think that being and non-being are the same.

The goddess names several characteristics of what-is. It is ungenerated and imperishable, whole and one, unperturbed, complete, completely present (without past or future), and continuous. Parmenides makes use of the Principle of Sufficient Reason to say that there is no sufficient reason for being, or what-is, to have been generated at this time or that (McKirahan 167). If at one time it came to be, that means that at one time it was not, which is impossible. It cannot not be, that is, what-is is necessary. Moreover, what-is is motionless, since motion would involve non-being, that is, changing in place or in quality requires going from what is to what is not. It is therefore the same all around and held within a limit, “which confines it round about” (F8.31). Parmenides goes so far as to compare it to a ball, maintaining balance and equal tension in all directions from the center out. It is thus complete. Is it problematic to have being bounded by a limit? Would this not mean that there is something outside being, effectively making what is outside its limits non-being? Apparently, we are to remain resolute in thinking of the sphere as complete and as all being, even though we mortals sometimes mistakenly divide it up, or conceive it as something inside a container.

ii. The Path of Opinion

Now the goddess presents the way of Opinion. She claims that her words about this way will be illusory or deceptive, meaning that the subject matter itself produces the deception. Mortals claim that there is both being and non-being. We observe the world with our senses, and put too much faith in these rather than in reason, which tells us that there is only one true way: being. Oddly, our interpretations of Parmenides become even more obscured when we reach this section. The reader is tempted to believe that Parmenides himself gave at least some degree of credence to mortal opinion. Indeed, we are told that Parmenides considered the earth and fire to be the sources of all that is. Aristotle says that Parmenides does this in order to explain why, for reason, there is only one eternal being, while for the senses, there is a plurality of beings. Parmenides classified the hot, then, as what-is, and the cold as non-being (Graham 221).

Parmenides must in some way account for the fact that most human beings hold fast to the information that the senses provide. If most of us are in error, it is a subtle and elusive one. Since, by habit, we are so easily convinced of the truth of the senses, Parmenides attempts to explain why this is, and also attempts to give us a more intelligible account of the sensible world. The information we have does not present a clear picture of Parmenides’ vision of the cosmos, but it does give us some ideas of its nature. The hot is responsible for separation, and the cold is responsible for coalescence. Beyond this, Parmenides seems to have been a rather serious astronomer, whose astronomical theory in some important ways prefigures modern astronomy. He may have been the first Greek—the Babylonians already being privy to it—to have claimed the morning and evening star to be the same thing (Graham 225). He also claimed that the moon’s light is a reflection of the sun’s light. He may even have thought that the earth was spherical (Graham 241). Again, the earth, like being, has no reason to move this way or that, due to its equilibrium.

We see in Parmenides a reverence for reason. Even his cosmology is based upon reason rather than the senses alone. In a time before telescopes or any other sophisticated observational technology, Parmenides had to move beyond the evidence of the senses alone to determine that the morning and evening star is the same, and that the moon reflects the sun’s light. To all appearances, the moon somehow generates its own light. Parmenides, however, moved beyond appearances to explain appearances. For this very reason there is also tension in Parmenides’ thought. No matter how much faith we put in reason, and no matter how much we deny the evidence of the senses, the sensory world still convincingly thrusts itself upon us, and demands our thought, attention, and understanding. Perhaps in the end this understanding of the natural world, which to all appearances is a mixture of being and non-being, shows us a unified, eternal and simple being.

b. Zeno

Zeno (c. 490-c. 430 B.C.E.), also a native of Elea, was Parmenides’ student and possibly his boyfriend (homosexuality in ancient Greek culture was fairly common; among intellectuals, the student performed favors in order to receive the teacher’s wisdom). As Daniel Graham says, “Parmenides argues for monism, Zeno argues against pluralism” (Graham 245). That is, Zeno seems to have composed a text wherein he claims to show the absurdity of accepting that there is a plurality of beings. He uses arguments, often in a reductio ad absurdum form, to prove positively that there cannot be plurality, and negatively (or by an implied inference), that the only possibility is that what-is is one. Beyond this, he argued against motion and against place. Suffice it to say, Zeno’s paradoxes have since his day provided problems for philosophers and mathematicians alike. Let us examine some of Zeno’s arguments.

i. Arguments against Plurality

Many of Zeno’s arguments can be dizzying. One argument contains an important claim upon which many other arguments have their foundation. There might have been an argument for this claim, but there is none extant (Graham 267). For the sake of clarity, Graham’s summary of this initial claim (claim (a) below) and following arguments will be quoted:

(a) If there are many things, no one has size because it is one and the same as itself.

(b) If each of the many did not have size, it would not exist, for if it were added to or subtracted from something, it would make no difference to that thing.

(c) If there are many things, each must have size and solidity, and hence each must have parts with size and solidity, and similarly each of these parts must have parts.

(d) Hence, if there are many things, they must be both small and large; so small as to have no size, and so large as to be unlimited (infinite). (267)

The set of arguments (b)-(d) is aiming to disprove plurality. These arguments seem somehow to be based upon (a), which seems to be the conclusion of an argument for which we have no premises. At the least, we can see here, if only obscurely, Zeno’s efforts to deny pluralism.

An argument from Plato’s Parmenides goes like this. If there are many things, then each thing will be both like and unlike, and so a contradiction ensues (F1). For example, body X will be like bodies Y and Z in that all three are bodies taking up space. Yet, each of the three will be unlike the other since, let us suppose, X is red, Y is blue, and Z is green. Thus, X is both like and unlike Y and Z. If this is all there was to Zeno’s argument, as Plato presents it (perhaps simply for the dramatic purposes of the dialogue), then it is not a contradiction, since each body is like and unlike the other in different respects (McKirahan 182).

Zeno shows that if we attempt to count a plurality, we also end up with an absurdity. If there is a plurality, then the things in it would be neither more nor fewer than the number that they are. Thus, there would be a finite number of things. On the other hand, if there is a plurality, then the number would be infinite, because there is always something else between existing things, and something else between those, and something else between those, ad infinitum. Thus, if there were a plurality of things, then that plurality would be both infinite and finite in number, which is absurd (F4).

ii. Dichotomy

A central argument, at least in what we have available of Zeno’s work, is what the ancients called the argument from dichotomy. There are two versions of this argument. In the first, we suppose that what-is is divisible, and then we end up with two absurdities. If it is divisible, it will be divided down into an infinite number of finite parts, or it will be divided so much that nothing at all is left over. The first option is less clear. Zeno probably has in mind that an infinite number of finite parts would go to make up something that is infinitely great in size when taken as a whole (as above). The second option is clearly absurd. Therefore, being or what-is is one and indivisible (Graham 259).

iii. Infinite Divisibility and Arguments against Motion

The idea of infinite divisibility plays a key role in many of Zeno’s arguments. For example, let us look at his arguments against motion. It is impossible for a body in motion to traverse, say, a distance of twenty feet. In order to do so, the body must first arrive at the halfway point, or ten feet. But in order to arrive there, the body in motion must travel five feet. But in order to arrive there, the body must travel two and a half feet, ad infinitum. Since, then, space is infinitely divisible, but we have only a finite time to traverse it, it cannot be done. Presumably, one could not even begin a journey at all. Aristotle criticized this argument by saying that there are two senses of “infinite” with reference to magnitudes: there is infinite divisibility and infinity with reference to extremes (Graham 261). We cannot get through an infinite quantity in a finite time, but one can get through an infinitely divisible space, because time is also infinitely divisible. If there is a parallel between the divisibility of space and time, then we can cross an infinitely divisible span of space, because there will be a bit of time measuring each bit of the motion in which to do it.
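
A modern reader may find it helpful to see the twenty-foot example worked through numerically. The sketch below is an illustration in the spirit of Aristotle's reply, not a reconstruction of Zeno's or Aristotle's own reasoning. It assumes a constant speed (the value of 4 feet per second is hypothetical), and the function names are chosen here for convenience. It splits the distance into stages of 10, 5, 2.5 feet and so on, and shows that both the distance covered and the time elapsed stay below finite totals while the remainder keeps halving.

```python
from fractions import Fraction

def stage_lengths(total, n):
    """First n stage lengths: total/2, total/4, ..., total/2**n."""
    return [Fraction(total, 2 ** k) for k in range(1, n + 1)]

def covered(total, n):
    """Distance covered after the first n stages."""
    return sum(stage_lengths(total, n))

def remaining(total, n):
    """Distance still to go after n stages; equals total / 2**n."""
    return total - covered(total, n)

def elapsed(total, n, speed):
    """Time used on the first n stages at an assumed constant speed."""
    return covered(total, n) / speed

TOTAL_FEET = 20   # the distance used in the text
SPEED = 4         # hypothetical speed in feet per second

for n in (1, 2, 3, 10):
    print(n, covered(TOTAL_FEET, n), remaining(TOTAL_FEET, n),
          elapsed(TOTAL_FEET, n, SPEED))
```

On these assumptions each stage takes a proportionally smaller bit of time, so the stages never add up to more than 20 feet or more than 5 seconds. Whether this answers Zeno philosophically is a separate question that the arithmetic alone does not settle.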

Similar to this argument is the Achilles argument. Swift-footed Achilles will never be able to catch up with the slowest runner, assuming the runner started at some point ahead of Achilles, because Achilles must first reach the place where the slow runner began. This means that the slow runner will already be a bit beyond where he began. Once Achilles progresses to the next place, the slow runner is already beyond that point, too. Thus, motion seems absurd.
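
The stages of the Achilles argument can also be traced numerically. The following sketch is a modern illustration only; the head start of 9 units and the speeds of 10 and 1 are hypothetical values chosen to keep the fractions simple, and `chase` and `catch_up_time` are names introduced here. Each stage moves Achilles to the spot the slow runner occupied at the start of that stage, exactly as the argument describes.

```python
from fractions import Fraction

def chase(head_start, v_fast, v_slow, stages):
    """Return (elapsed_time, gap) after each stage of the argument."""
    gap = Fraction(head_start)
    elapsed = Fraction(0)
    history = []
    for _ in range(stages):
        dt = gap / v_fast      # time for Achilles to reach the runner's last position
        elapsed += dt
        gap = v_slow * dt      # distance the slow runner has advanced meanwhile
        history.append((elapsed, gap))
    return history

def catch_up_time(head_start, v_fast, v_slow):
    """Time at which the two positions coincide, assuming constant speeds."""
    return Fraction(head_start, v_fast - v_slow)

for t, gap in chase(9, 10, 1, 4):
    print(t, gap)
print(catch_up_time(9, 10, 1))
```

With these assumed values the gap shrinks by a factor of ten at every stage and the elapsed time approaches, but within the stages never reaches, the single moment at which ordinary arithmetic places the overtaking. The sketch shows where the infinitely many stages sit in time; it does not by itself say whether an infinite sequence of tasks can be completed.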

Again, an arrow flying from point A to point B is actually not in motion. At each moment in its apparent flight, it occupies a place equal to its size. If something occupies a place equal to itself, it must be at rest, since nothing can be in a place equal to itself while in motion. Thus, the arrow is not actually in flight, but at rest in its place. Aristotle’s criticism here is that Zeno assumes time to be composed of indivisible moments or “nows.” Now the arrow is here, and now it is here, and now it is here, and so on. The other assumption of Zeno’s argument is that something is only in a place when it is at rest. He also argues against place, however, by saying that if something is in a place, then that place must be in a place, and that place must be in a place, ad infinitum. Thus, if everything is in a place, then there would be infinite places of those places, and this is absurd (Graham 261).

The most conceptually difficult argument is the Stadium or Moving Rows paradox. Suppose there is a set of bodies at one end of a racetrack and one at another. They will both move in opposite directions at equal speeds and will thereby run past one another. They will both pass by a third set of stationary bodies equal in size to the racing bodies. The Stadium paradox is often illustrated in the following way.

The Bs and Cs are in motion, while the As are stationary. The Bs and Cs are moving at an equal and constant rate of speed. Since their starting point is the middle A, so to speak, it should take the Bs and Cs twice as long to bypass each other as it takes them to bypass the As. That is, the rightmost B must move past only one A, while it must move past two Cs, and the leftmost C must move past two Bs, but only one A. The Cs and Bs have therefore moved across both a longer and a shorter distance at the same time; thus the contradiction (Graham 263). Aristotle, however, says that this reasoning is fallacious since the Bs and Cs are in motion. Since they are in motion, and moving at an equal speed, it will take them half as long to move past each other as it does to pass a stationary A (Graham 263). Some commentators, thinking that Zeno could not possibly have made such an egregious error, suppose that Zeno might have intended for each body in the row to be atomic, i.e., indivisible. If this were the case, then a B cannot move past only half of an A or a C (since they are indivisible), but must move past the whole body at once. Thus, Zeno’s paradox would remain intact, although we have no textual evidence that this is what Zeno had in mind (McKirahan 192).
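
Aristotle's reply turns on what would now be called relative speed. The sketch below is a modern gloss under stated assumptions, not something drawn from the ancient texts: every body is given a length of one unit, the moving rows travel at one unit of distance per unit of time, and the function name `time_to_pass` is introduced here for illustration.

```python
from fractions import Fraction

def time_to_pass(length, own_speed, other_velocity):
    """Time for a body moving at own_speed to pass a body of the given length.

    other_velocity is measured in the same direction as own_speed, so a
    stationary body has velocity 0 and an oncoming body a negative velocity.
    """
    relative_speed = own_speed - other_velocity
    return Fraction(length, relative_speed)

# A B passing a stationary A (assumed unit length and unit speed).
t_pass_A = time_to_pass(1, 1, 0)

# The same B passing an oncoming C that moves at the same speed.
t_pass_C = time_to_pass(1, 1, -1)

# Number of Cs a B passes in the time it takes to pass one A.
cs_per_a = t_pass_A / t_pass_C

print(t_pass_A, t_pass_C, cs_per_a)
```

On these assumptions a B passes an oncoming C in half the time it needs for a stationary A, and so passes two Cs while passing one A. This matches the counts in the paradox without any contradiction, provided the bodies are not taken to be indivisible.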

The final paradox is the millet seed paradox, which is either given to us in an incomplete way, or is simply fallacious. If a bushel of millet seeds is dropped, it will make a sound. If this is true, then one millet seed when dropped should also make a sound, and one ten thousandth of a part should as well. But this does not happen. As it is, there are two problems with this argument. On the surface, we do not know what Zeno meant to prove from this. Logically, the argument commits the fallacy of division. Just because the whole (the bushel) makes a sound when dropped, we cannot conclude that any given part (one ten thousandth of a seed) will as well (Graham 265). Whatever the case, the overall picture of Zeno is of his fight against plurality and motion for the sake of monism.

c. Melissus

We know little about Melissus’ life except that he was an admiral and organized a battle against the Athenians (c. 441 B.C.E.). Philosophically, he clearly defends Parmenidean monism, although he does differ from Parmenides on at least two counts: the temporality of what-is, and whether what-is is unlimited or limited. He also differs from Zeno by laying out a clear thesis defending the unity of being.

Melissus sets out a system of concomitant and sequential arguments. First, what-is, or being, cannot have come from nothing. Nor could being have come to be from what-is, because this would mean that being already was. Likewise, and perhaps inversely to the first principle, being cannot become non-being. It therefore cannot perish. So, being, or what-is, is everlasting. Next, since it is everlasting—it does not come to be or perish—it has no limits set upon it, and so it is unlimited (apeiron). From this, we can see that being is one. If it were two or more, then each would be limited by the other. This leads us to see that what-is must be the same as itself, and therefore cannot be subject to the throes and flux of rearranging, pain, or any other sort of passion. Closely related to this, what-is must be motionless, since motion is a type of change. Similarly, there is no void, since the void would be nothing. This is another reason why what-is cannot move. To move, there must be emptiness or void, but since void cannot exist, we are left with fullness, that is, being is a plenum (Graham 467).

What is Melissus’ answer to the objection that we clearly observe with our senses flux and change in the world? He claims that there is only one thing that follows from his thesis. If there really is earth, fire, different types of metals, and so forth, then they must be like the one or what-is—they must each be as we first perceive them to be, for example, this here is fire, and that there is earth, and nothing else. However, when we think we see something hot becoming cold, then we simply have not observed correctly. “For they would not change if they were real, but they would remain just such as each appeared to be” (F8). Melissus does not explain what it is about our observation that goes awry. How is it that we make mistakes like thinking that we have observed a metal corroding? Melissus has no satisfactory answer to this question. If, he says, we observe correctly—if what we observe is real—it cannot change. He wants to hang on to an idea of reality where the elements, at least, remain. If we see fire, then there is always fire, despite this particular blaze burning out. Although this or that fire may be extinguished, fire is not extinguished.

7. Philosophies of Mixture

Anaxagoras and Empedocles are alike in at least two ways: first, they adhere to the Eleatic principle that being is necessary, that is, it is impossible for being not to be; second, and related to this Eleatic principle, being cannot be generated, nor can it perish, and thus all being is a continual process of mixture and separation.

a. Anaxagoras

Anaxagoras of Clazomenae (c. 500-c. 428 B.C.E.) had what was, up until that time, the most distinctive perspective on the nature of matter and the causes of its generation and corruption. Closely predating Plato (Anaxagoras died around the time that Plato was born), Anaxagoras left his impression upon Plato and Aristotle, although they were both ultimately dissatisfied with his cosmology (Graham 309-313). He seems to have been almost exclusively concerned with cosmology and the true nature of all that is around us. In fact, some ancient authorities have even called him an atheist (Graham 305). This might be due to his purely naturalistic explanations of the world. He thought, for instance, that the sun, moon, and other heavenly bodies were fiery stones rather than divinities (Graham 297). He is also thought to have explained, more or less correctly, the phenomenon of hail (Graham 303). As we shall see, Anaxagoras called upon his senses to do their work, but also his mind to look beyond what could be seen into the causes for all things.

Before the cosmos was as it is now, it was nothing but a great mixture—everything was in everything. The mixture was so thoroughgoing that no part of it was recognizable, due to the smallness of each thing, and not even any colors were perceptible. He considered matter to be infinitely divisible. That is, because it is impossible for being not to be, there is never a smallest part, but there is always a smaller. If the parts of the great mixture were not infinitely divisible, then we would be left with a smallest part. Since the smallest part could not become smaller, any attempt at dividing it again would presumably obliterate it. The infinitely divisible parts seem to have at least been mixtures of elemental or basic stuffs—earth, wet and dry, hot and cold, and “seeds” (sperma). The nature of these seeds is unclear. They might have been simply the germ of generation or small bits of elemental things. At any rate, these seeds and all other things were mixed together prior to separation (F1-F5).

The separation of the thoroughgoing mixture was generated by a high-speed centrifugal spin (F7). It was the force and speed of the spinning that caused the separating off of each being from the other. However, this separation was not a complete purification or isolation of parts. In fact, beings in the world as we know it, says Anaxagoras, are still mixtures (F8). Everything is still in everything. The difference is that the separating force generated recognizable and individuated beings. So why, then, does gold appear to us as gold and not, say, bone, since everything is in everything? A gold coin is considered to be gold because it has more gold than anything else. The predominant bits, in other words, make up the being as we know it (McKirahan 213). The question of how something small, like a gold coin, could ever hold bits of everything in it goes unanswered in the surviving evidence for Anaxagoras.

The processes of mixture and separation are unceasing. Generation, says Anaxagoras, is mixing, and what appears to be perishing is really separation (F11). This has profound implications for what we consider to be human mortality. Under Anaxagoras’ cooperating principles of mixture and separation, what appears to be change into non-being (death) is impossible. We might surmise that what we call death is nothing more than a separation of these parts (this particular human body) and a mixture back into those parts (the earth). Likewise, a birth cannot be a creation out of nothing. The birth began as a mixture of seeds, which themselves were presumably already mixtures of other things. What comes to be cannot come from what is not. So, generation relies upon what already is. The Anaxagorean world, then, is a continuous play of being. Like the Eleatics, Anaxagoras relies upon the idea that what-is cannot possibly not be, that is, being is necessary. Also like the Eleatics, the senses, for Anaxagoras, do not give us an exhaustively accurate picture of reality—we must rely upon reason to make sense of the world. The difference between Anaxagoras and the Eleatics, however, is that Anaxagoras allows for change and natural processes to take place, without reducing these processes to sensory illusions.

There is one important player in this continuous play of being yet to be mentioned: mind (nous). Although mind can be in some things, nothing else can be in it—mind is unmixed. We recall that, for Anaxagoras, everything is mixed with everything. There is some portion of everything in anything that we identify. Thus, if anything at all were mixed with mind, then everything would be mixed with mind. This mixture would obstruct mind’s ability to rule all else. Mind is in control, and is responsible for having started the spinning of the great mixture, such that individual beings were generated in the process of separation. Everlasting mind—the most pure and fine of all things—is responsible for ordering the world. Thus, Anaxagoras’ world is not a chaotic process of mixture and separation; rather, the processes of mixture and separation are ordered by mind, which is unmixed.

Anaxagoras left his mark on the thought of both Plato and Aristotle, whose critiques of Anaxagoras are similar. In Plato’s Phaedo, Socrates recounts in brief his intellectual history, citing his excitement over his discovery of Anaxagoras’ thought. He was most excited about mind as an ultimate cause of all. Yet, Socrates complains, Anaxagoras made very little use of mind to explain what was best for each of the heavenly bodies in their motions, or the good of anything else. That is, Socrates seems to have wanted some explanation as to why it is good for all things to be as they are (Graham 309-311). Aristotle, too, complains that Anaxagoras makes only minimal use of his principle of mind. It becomes, as it were, a deus ex machina, that is, whenever Anaxagoras was unable to give any other explanation for the cause of a given event, he fell back upon mind (Graham 311-313). It is possible, as always, that both Plato and Aristotle resort here to a straw man of sorts in order to advance their own positions. Indeed, we have seen that Anaxagoras’ principle of mind set the great mixture into motion, and then ordered the cosmos as we know it. This is no insignificant feat.

b. Empedocles

We have an extensive poem from Empedocles of Acragas (in Sicily). He lived from 495 to 435 B.C.E., overlapping with Anaxagoras and Socrates. Much legend surrounds his life, and it is of course difficult to distinguish fact from fiction. He was a philosophical adherent of a Parmenidean principle of being, that is, what-is cannot not be (Graham 333). Politically, he was an advocate for democracy (Graham 335). Religiously, he seems to have been a Pythagorean, advocating a particular diet (F146-147) and endorsing the doctrine of the transmigration of souls (F124). He was reportedly a physician with a penchant for magic and prophecy. He was supposed to have kept alive a woman who neither breathed nor ate for thirty days (Graham 333). He was reportedly a self-proclaimed god, wearing purple robes, bronze shoes, and a gold wreath. To show his divinity, we are told that he leapt into the volcano at Mount Etna, purifying himself of his body (Graham 337). Legend notwithstanding, we have a substantial amount of his poetry, even if it is at times cryptic.

i. Macrocosm

At its most basic, the cosmos consists of a total of four elements or “roots,” plus two forces that are responsible for combining and separating these elements (F9). Empedocles was the first to name the four elements as earth, air, fire, and water. Love is the force that brings these elements, and the things generated from them, together, while Strife rends them (Graham 347). Empedocles, in Fragment 20 for example, repeatedly refers to a ceaseless cycle of unity from plurality (a movement of Love), and plurality from unity (a movement of Strife). While the names Empedocles uses for these forces might seem to us to carry moral overtones (Love as good and Strife as bad), they appear to be morally neutral for Empedocles—Love and Strife are simply the natural forces that guide the ceaseless motion of being.

Reminiscent of Anaxagoras’ mixture, Love holds all things together in perfect unity when it reigns supreme. As Strife begins to hold sway, the unity is pulled apart, presumably producing the sorts of singular beings we see all around us now. Empedocles makes clear, however, that these cycles are not cycles of production out of nothing or perishing into nothing (F11). What-is, or being, never ceases to be, and something cannot come from nothing, nor can anything utterly perish into nothing. Human beings are simply mistaken when we claim that this is how the world works. Empedocles claims to employ the language of birth and death only as a matter of convention, recognizing that the truth is always at hand (F12). Love and Strife are not only responsible for the unification and pluralization of the elements and all things, but they are also at play in the world as we know it now. Through an everlasting play of alteration, some things are repelled by one another through Strife, and others are brought together through Love. Some things are fitted for blending, and others are prone to separation (F23). Empedocles likens this to painters mixing colors, some more and some less, in order to create a painting (F24). Likewise, Love and Strife (the painters) bring together and pull apart the primeval elements. Everything that was, is, and will be owes its being to the play of Love and Strife.

How the process of mixture and separation happens is unclear. Empedocles tells us that there is a vortex. When Love is at the middle of the vortex, all things are unified—all things come from their respective places to join together in Love. All the while, Strife is retreating to the outside of the vortex. When Strife gains the strength to do its work, then there is the separation of the elements. First air (or aether) was separated off, then fire, earth separated off next, and then water gushed forth from earth as a result of the pressure of the heavenly rotations. When everything is in complete separation, nothing of our world is recognizable. We are, presumably, now living in a world wherein Love and Strife are both at work, with neither one dominating (McKirahan 269-270).

ii. Microcosm

Despite the predominance of his macrocosmic metaphysics in the surviving works and fragments, Empedocles did reason on the microcosmic and physical level. Different kinds of flesh seem to have been generated from different blends of the four elements (Graham 381). For human beings, perception and intelligence are keener in those whose elements are mixed more equally. Perception and intelligence, in fact, seem proportionate to one another—the more perceptive a being, the more intelligent the being (Graham 403). Moreover, thought seems to be a function of blood circulation, and Empedocles identifies the area around the heart as the area for thought (F115). There may also be a connection here with his theory of respiration. Inhalation occurs when blood retreats from tubes (presumably in the nose) and air fills those tubes. Blood rushing back into the tubes forces the air out (F78). Perception itself seems to occur when certain “effluences” from the perceived thing flow through the medium (air or water, for example) and into the pores of the sense organs. One sense cannot sense the object of another sense because the size and nature of the pores will not allow it. For example, the eyes seem to contain light or fire, and let in a certain amount of light. The ears, however, receive sound when the air outside moves and strikes the inner ear causing an echo (Graham 401).

Sometimes Empedocles describes himself as a fugitive from the gods (F8), and sometimes as himself a god who speaks the truth: “When I come to other flourishing cities I am revered by them, men and women alike” (F120). And again, “But why do I urge these things, as if doing some great deed, if I am superior to mortal men who perish many times?” (F121). At times, it seems he is a fallen god, and that humankind is for the most part fallen from divinity (F8). He seems to have advocated some type of transmigration of souls (F124), with reincarnation being based upon the purity of one’s life (Graham 415). Whether or not dietary and spiritual purity will result in salvation or re-divination is unclear. It does seem, however, that physical and spiritual purity and intellectual prowess bring us closest to divinity. After a “fast from wickedness” (F150), “they become prophets, singers of hymns, physicians, and leaders among men on earth; afterwards they blossom as gods foremost in honors” (F153). Through it all, Love and Strife dominate and dictate the cycles of being.

8. The Atomists

Ancient atomism began a legacy in philosophical and scientific thought, and this legacy was revived and significantly evolved in modern philosophy. In contemporary times, the atom is not the smallest particle. Etymologically, however, atomos is that which is uncut or indivisible. The ancient atomists, Leucippus and Democritus (c. fifth century B.C.E.), were concerned with the smallest particles in nature that make up reality: particles that are both indivisible and invisible. They were to some degree responding to Parmenides and Zeno by indicating atoms as indivisible sources of motion, while Parmenides and Zeno considered the world to be indivisible and motionless. Since we have very little from Democritus’ teacher, Leucippus, the focus here will be on Democritus’ thought.

a. Ontology

Despite the fact that Democritus was supposed to have been a prolific writer (we are told that he wrote approximately seventy books), we now have very little of his writing on atomism (Graham 521-525). What seems clear, however, is that Democritus thought that reality is made up of the full and the empty (void). The full is what-is, and the void is what-is-not (F4). Curiously, however, Democritus said that what-is is no more than what-is-not, that is, they have the same ontological status—each is as real as the other. We might interpret this, along with Aristotle, as meaning that there is body (the full), and there is void, and neither has any higher degree of being than the other. That the void is, is as much an ontological fact as the being of the plenum (Graham 525).

Atoms—the most compact and the only indivisible bodies in nature—are infinite in number, and they constantly move through an infinite void. In fact, motion would be impossible, says Democritus, without the void. If there were no void, the atoms would have nothing through which to move. Atoms take on a variety, perhaps an infinite variety, of shapes. Some are round, others are hooked, and yet others are jagged. They often collide with one another, and often bounce off of one another. Sometimes, though, the shapes of the colliding atoms are amenable to one another, and they come together to form the matter that we identify as the sensible world (F5). This combination, too, would be impossible without the void. Atoms need a background (emptiness) out of which they are able to combine (Graham 531). Atoms then stay together until some larger environmental force breaks them apart, at which point they resume their constant motion (F5). Why certain atoms come together to form a world seems up to chance, and yet many worlds have been, are, and will be formed by atomic collision and coalescence (Graham 551). Once a world is formed, however, all things happen by necessity—the causal laws of nature dictate the course of the natural world (Graham 551-553).

Figure, order, and position (or orientation) serve as the basic marks of distinction among atoms and the things that are (F4). Leucippus and Democritus seem to have identified these distinguishing marks as contour, contact, and turning (or rotation), respectively. These three determine which atoms combine to form elemental bodies like fire and water. It is important to note, however, that atoms themselves are immutable. The sensible world is generated from their combination, and things perish when some force causes the dispersal of the atoms.

b. Perception and Epistemology

Atoms are also responsible for sense perception and thought. Atoms of particular shapes are responsible for particular tastes, for example, round atoms are responsible for sweet tastes, while sour flavors consist of rough and angled atoms (Graham 581). Touch works similarly. Sight, hearing, and smell, however, are in some sense reducible to touch. Sensed objects always have effluences (Graham 585). We can see a tree, for example, because the tree’s atomic form somehow flows from it and makes contact with the atoms making up the eye, and the image of the tree is therefore carried into the eye. This might raise the problem of how effluences from large objects (for example, buildings) can fit into an object as small as the eye, but it could be that the effluences are somehow condensed before entering the eye (McKirahan 332). Democritus’ view of perception has important consequences for his epistemology.

If what we perceive are effluences of things, we do not perceive the things themselves; thus, we cannot know things as they are in themselves, but only as they appear to us (Graham 624). The truth is that there are atoms and void; all else is opinion and convention. It was said above that certain types of atoms are responsible for certain types of tastes, but even here convention and relativity have the final word. When certain atoms from certain objects come into contact with the atoms of different perceivers, what is sweet to one person might taste bitter to another. “By convention bitter, by convention hot, by convention cold, by convention color, but in reality atoms and void” (F32a). More precisely, we thoroughly understand very little, “but we perceive what changes in relation to the disposition of the body as things enter or resist” (F33). Even the human soul is a certain configuration and balance of atoms, and the best we can do is think, even if we cannot know much. In this way, Democritus is seen to be influential for Skepticism (Graham 516), but he is not a thoroughgoing skeptic since he claims that atoms and void can be known.

c. Ethics

While we have scant direct access to Democritus’ physical theory, we have an abundance of his own words regarding ethics. Most of his ethical thought comes to us in pithy aphorisms, with a central theme of contentment and freedom from disturbance. Well-being is founded upon contentment and being undisturbed, and these are attained by doing what is truly beneficial for oneself (Graham 633-635). The measure of what is beneficial is pleasure and pain, or joy and sorrow (F150b). It is clear, however, that Democritus does not condone sensual hedonism. In other words, there seems to be a loftier standard for what counts as pleasurable or joyful. “Those who get pleasure from the belly, when they exceed what is appropriate in food, drink, or sex, all find their pleasures are brief and short-lived, lasting only as long as they are eating or drinking, and their pains many” (F149). Constantly and excessively seeking pleasure in the flesh leads only to pain. By contrast, “reason is accustomed to take joys from itself” (F154). So, it is intellectual pleasure that is truly beneficial, and is the best measure of the best sort of life.

Reminiscent of Heraclitus, Democritus says that the best sort of person sees greater value in thinking than in polymathy (F203), and greater value in good action than in words about goodness (F267-F268). Fools leave things to chance (F105), while the wise person thinks, learns, and plans according to intelligence (F93). Interestingly, there is here a juncture of Democritus’ physical thought and his ethics. If the soul is a configuration of atoms, then teaching, learning, thought, and wisdom can help to refigure the soul and free us from the tyranny of chance (Vlastos 55-57). Pleasure and pain figure significantly into Democritean ethics, but it is pleasure of a higher sort that is constitutive of a good life. Reining in one’s desires is not sufficient for the best sort of life. “Goodness is not just avoiding wrongdoing but avoiding even the desire for it” (F83). Seeking sensual pleasures leads to a disordered and painful life, while seeking the pleasures of wisdom and understanding furnishes us with a harmonious and cheerful life.

9. Diogenes of Apollonia

Scholars know little about Diogenes’ life. He might have been active in the middle or late fifth century (McKirahan 346). We do know, however, that he resurrected material monism. Like Anaximenes, he posited air as the primary element. Unlike the records and fragments that we have of Anaximenes, Diogenes makes explicit the reason why there must be an essential and common element. “My view, in general, is that all existing things are altered from the same thing and are the same thing” (F2). Evidently, based upon the purported introduction to his text (assuming that what was just quoted immediately succeeds the introduction), Diogenes takes this to be an indisputable starting point (F1). If everything in the cosmos were different, having no nature in common, then nothing would be able to mix with anything else; for example, no plant would be able to grow from the earth. Thus, apparent difference in being is only a variation on the same type of being. The whole cosmos is a constant alteration of one being.

Why must this common or basic being be air? Animals, including human beings, cannot live without respiration, that is, air is essential for life. Following a traditional view, Diogenes considers air to be the soul or life of animals. When respiration ceases, life (the soul) leaves the body (F6). Soul, life, and air are treated synonymously in this context (F10). Moreover, air is also responsible for intelligence (F5). Again, when one ceases to breathe, one is no longer intelligent. As intelligence, air “steers and controls all things;” therefore, air seems to be “God, and to reach everywhere, to arrange all things, and to be present in everything” (F5). Everything partakes of air, but nothing partakes of air in quite the same way, “but air itself and intelligence have many forms” (F5). Sometimes air is “warmer or colder, drier or moister, and more stationary or more lively in motion, and many other differentiations are present in it, including countless differentiations of flavor and color” (F5). The differentiations of air range from the most obvious, to those so subtle we can scarcely imagine.

Diogenes tells us that no two differentiated things can become exactly like one another without becoming the same. “Nothing…of those things that are differentiated one from another is able to become exactly like the other without becoming the same” (F5). In other words, no two things can be identical and simultaneously be distinct from one another. If two or more things are identical, then they are not distinct, but the same thing, and we have no way of distinguishing between them. There are many differences among beings in the cosmos; yet, the underlying nature remains the same. This allows for varying degrees of life and intelligence among beings. Therefore, there is no reason to lump Diogenes in with the traditional and shortsighted view that only human beings have intelligence. Other beings might have intelligence as well, but to varying degrees. Air allows for the eternal being of the cosmos, the differentiation and intelligence of all things.

10. The Sophists and Anonymous Sophistic Texts

As with the terms “cynic” and “stoic,” our modern usage of “sophistry” comes to us from a school of thought, which took its course in Greece around the fifth century B.C.E. Again, as with “cynic” and “stoic,” the current connotations of “sophistry” are not without their roots in the historical group of thinkers called Sophists. Yet, as we cannot reduce the thought of the Cynics and Stoics to mere cynicism or apathy, we cannot reduce the thought of the Sophists to mere sophistry. As we have seen, it has been tempting to read the Presocratics through the lens of Plato’s and Aristotle’s thought, and this is no less the case with the Sophists. In fact, two of Plato’s dialogues are named after Sophists, Protagoras and Gorgias, and one is called simply The Sophist. Beyond this, typical themes of Sophistic thought often make their way into Plato’s work, not the least of which are the similarities between Socrates and the Sophists (an issue explicitly addressed in the Apology and elsewhere). Thus, the Sophists had no small influence on fifth-century Greece and Greek thought.

Broadly, the Sophists were a group of itinerant teachers who charged fees to teach on a variety of subjects, with rhetoric as the preeminent subject in their curriculum. A common characteristic among many, but perhaps not all, Sophists seems to have been an emphasis upon arguing for both opposing sides of a case. Thus, these argumentative and rhetorical skills could be useful in law courts and political contexts. However, these sorts of skills also tended to earn many Sophists their reputation as moral and epistemological relativists, which for some was tantamount to intellectual fraud.

a. Protagoras

One of the earliest and most famous Sophists was Protagoras (c. 490-c. 420 B.C.E.). Only a handful of fragments of his thought exist, and the bulk of the remaining information about him found in Plato’s dialogues should be read cautiously. He is most famous for the apparently relativistic statement that human beings are “the measure of all things, of things that are that they are, of things that are not that they are not” (F1b). Plato, at least for the purposes of the Protagoras, reads individual relativism out of this statement. For example, if the pool of water feels cold to Henry, then it is in fact cold for Henry, while it might appear warm, and therefore be warm for Jennifer. This example portrays perceptual relativism, but the same could go for ethics as well, that is, if X seems good to Henry, then X is good for him, but it might be bad in Jennifer’s judgment. The problem with this view, however, is that if all things are relative to the observer/judge, then the idea that all things are relative is itself relative to the person who asserts it. The idea of communication is then rendered incoherent.

On the other hand, Protagoras’ statement could be interpreted as species-relative. That is, the question of whether and how things are, and whether and how things are not, is a question that has meaning (ostensibly) only for human beings. Thus, all knowledge is relative to us as human beings, and therefore limited by our being and our capabilities. This reading seems to square with the other of Protagoras’ most famous statements: “Concerning the gods, I cannot ascertain whether they exist or whether they do not, or what form they have; for there are many obstacles to knowing, including the obscurity of the question and the brevity of human life” (F3). It is implied here that knowledge is possible, but that it is difficult to attain, and that it is impossible to attain when the question is whether or not the gods exist. We can also see here that human finitude is a limit not only upon human life but also upon knowledge. Thus, if there is knowledge, it is for human beings, but it is obscure and fragile.

b. Gorgias

Not far behind Protagoras was Gorgias (c. 485-c. 380 B.C.E.). Perhaps flashier than Protagoras when it came to rhetoric and speech making, Gorgias is known for his sophisticated and poetic style. He is known also for extemporaneous speeches, taking audience suggestions for possible topics upon which he would speak at length. His best-known work is On Nature, Or On What-Is-Not wherein he, contrary to Eleatic philosophy, sets out to show that neither being nor non-being is, and that even if there were anything, it could be neither known nor spoken. It is unclear whether this work was in jest or in earnest. If it was in jest, then it was likely an exercise in argumentation as much as it was a gibe at the Eleatics. If it was in earnest, then Gorgias could be seen as an advocate for extreme skepticism, relativism, or perhaps even nihilism (Graham 725).

On Nature can be summarized as follows. If there is anything, it is either exclusively what-is or what-is-not, or both what-is and what-is-not are. Gorgias then eliminates each of these possibilities, beginning with what-is-not (non-being). If non-being were, then there is a contradiction, since it would simultaneously both be and not be. Moreover, if non-being were, then being (what-is) would not be, but then non-being would have the property of being, and being would have the property of non-being, which is absurd. Neither, however, is there being (what-is). If being were, it would have to be everlasting, generated, or both. If it were everlasting, it must have always been, and thus would be unlimited. But if it were unlimited, it would not exist anywhere, since for anything to be, it must be in some place, and this place must be different from that which is in it. Being cannot be generated, because if it were, it would have come to be from something that is (being), or from something that is not (non-being). If the former were the case, then being already was and did not need to be generated. If the latter were the case, then non-being would have caused being, which would be absurd. Finally, being both everlasting and generated would be a contradiction, since if it were everlasting, it could not have been generated, and vice versa (Graham 741-743).

Moreover, even if there were anything, then it could not be thought or known. In order for being to be the object of thought, then being must be, because if there is no being to be thought, it cannot be thought (Graham 743). Yet, if objects of thought were the same as what-is, then whatever we happen to think (unicorns, centaurs, and so forth) would be, but this is absurd (Graham 745). In addition, if objects of thought were things that are, then we would not be able to think of anything that is not, but since we can think of things that are not (unicorns, centaurs, and so forth), objects of thought cannot be tantamount to things that are.

Finally, even if we could think what-is, we would not be able to communicate it. We perceive objects that are different from us, for example, a table, a song, or a scent. We perceive these things by the respective senses, that is, sight, sound, and smell. We communicate by speech, but speech is not the same thing as what is perceived. “That by which we communicate is speech, but speech is not the subsisting and existing things themselves” (Graham 745). Thus, when we talk about the table, the song, or a particular scent, we do not communicate those very things to each other, but rather we communicate words. Just as, therefore, a sight cannot become a sound and vice versa, a perceived thing cannot become speech and vice versa. Again, whether this was all a mere jocular exercise in argumentation or an earnest stab at truth is unknown. If, however, it was the latter, then we seem to be left speechless in a world that is impossible to understand.

c. Antiphon

Very little is known about Antiphon the Sophist. He seems to have been known for courtroom speeches, dream interpretation, and claiming to heal depression (Graham 789). His views on justice and law are perhaps most salient in the extant fragments. Justice amounts to obeying the laws of the city in which one is a resident, but doing so only when others are present to witness it. When alone, it is better to value “the works of nature. The works of law are factitious, whereas those of nature are necessary” (F46a). The debate between law/custom (nomos) and nature (phusis) was a central theme of philosophical and sophistic thought in ancient Greece. To what degree is law natural? Is morality simply law and custom, or is it natural? Antiphon set law in opposition to nature, although it is unclear what he means by “the works of nature.” Antiphon could be interpreted as an advocate for hedonism. Indeed, things that bring pleasure, he claims, are truly advantageous and beneficial, thus following the course of nature. Things that bring pain, on the other hand, are not advantageous (F46a).

If we do read Antiphon as a hedonist, it would have to be a tempered hedonism that distinguishes between good and bad pleasures. He belittles the pleasures of sexual intercourse, claiming that such pleasures “do not travel alone, but in the company of sorrows and pains” (F51). He also looks with a critical eye towards money making, warning against miserliness. He recounts the story of a man whose hidden store of money was stolen. “His friend told him not to worry, but to put a stone in the same place where the money had been and imagine that he still had the money and he had not lost it. ‘For even when you had it, you did not use it at all; hence, do not feel deprived of it even now’” (F57). The lesson here seems to be that if one is going to make money, then one should use that money, for money stored away becomes superfluous. He also warns against doing evil to one’s neighbor, since this will necessarily incur evil for the perpetrator (F61). Moreover, “nothing is worse for men than a lack of discipline,” so we should raise our children well, and when they grow up, great changes will not overwhelm them (F64). So if we are to read Antiphon as a hedonist, then it is a hedonism that works towards what is truly advantageous for oneself—a hedonism tempered by practical wisdom.

d. Prodicus

Prodicus of Ceos (c. 465-c. 395 B.C.E.), like most Sophists, worked as a teacher and rhetorician. Like Protagoras, he presented a challenge to theistic thinking, but took this challenge further. He observed that the Greeks and Egyptians tended to consider all beneficial things to be gods. “Sun, moon, rivers, springs, and in general everything that benefits our life the ancients considered gods on account of the benefit accruing from them, just as the Egyptians make the Nile a god…” (F3c). This, of course, is not enough evidence to suggest that Prodicus was an atheist (although that word was broader for the ancients than for us, referring to those who hold no belief in gods, and to those who hold unorthodox beliefs in the gods), but it certainly represents a challenge to common theistic notions that the gods are independent of our judgments about them.

Plato portrays Prodicus as a specialist in correct diction. In the Cratylus, Socrates says,

The study of words is not a minor undertaking. If I had heard Prodicus’ fifty-drachma lecture, which provides the student complete instruction on this subject, as he himself advertises, nothing would keep me from telling you straightaway the whole truth about correct diction. But alas I have not heard it, but only the one-drachma lecture. (Graham 847)

This humorous passage is typical of Plato’s emphasis on the Sophists’ method of charging large sums of money for instruction. In fact, in the Hippias Major Plato says of Prodicus that “it is amazing how much money he took in by putting on demonstrations and instructing the young men” (Graham 843). As Graham points out, however, “The ability to make fine discriminations of words is important to rhetoric, and we should remind ourselves that there were no dictionaries in the classical age, and treatises such as Prodicus wrote were the first essays in lexicography and diction” (860). Thus, while Plato treats Prodicus with more respect than other Sophists, we should be aware that his agenda is in part to contrast Prodicus with Socrates, who claimed to teach nothing and to charge nothing for his discussions (compare with the Apology), and that Prodicus’ thought might have been far more important than Plato considered it to be.

e. Anonymous Texts

Two anonymous texts called the Anonymous Iamblichi and the Dissoi Logoi represent different ends of the spectrum of sophistic thought. The Anonymous Iamblichi is primarily an ethical work, dealing with reputation, virtue, and law. It exhorts the audience toward an education in virtue from an early age, because “a long time’s familiarity with a thing at length strengthens the practice, while a short time is not able to accomplish this” (Graham 865). Such a life requires self-control, especially an indifference to money, “by which everyone is corrupted” (Graham 867). The love for money is, for most people, merely a symptom of their fear, that is, fear of death, disease, old age, and so forth. These things can presumably be held at bay, so the masses think, by money. Rivalries and competition with others are also motives for greed. Thus, law is needed to ensure that money remains a good for the entire community, and moreover so that the community does not fall into dissolution. Lawlessness and greed beget tyranny. Thus, virtue and law are intimately connected.

The Dissoi Logoi, or Twofold Arguments, is a sophistic exercise in arguing for the relativity of things like good and bad, right and wrong, the just and the unjust, truth and falsity, and so forth. What is good in one situation might be bad in another, or good for one person, but bad for another. For example, “sickness is bad for the sick, but good for the physicians. Further, death is bad for those who die, but good for undertakers and makers of tombs” (Graham 879). The relativity of right and wrong to cultural sensibilities is also emphasized. “For example, it is right among the Spartans for girls to exercise naked and appear in public in clothing without sleeves and blouses; but it is wrong to the Ionians” (Graham 883). Again, “Among the Thracians it is a mark of beauty for girls to have tattoos; for everyone else tattoos are a punishment for a crime” (Graham 883). The problem with cultural relativism is that, when taken to its extreme, we cannot claim that certain activities are universally wrong or right, but only wrong or right relative to each culture. Thus, we may see that the arguments in the text are generally bad, but we have no reason to believe that they were meant to be good. The Dissoi Logoi might be emblematic of sophistical exercises at the time, but not necessarily of the more sophisticated of the Sophists.

11. Conclusion

From Thales to the Sophists, we see much variation in thought, as well as in the style and presentation of those different ways of thinking. Yet, we also see common threads running throughout Presocratic thought. On one hand, there is a tendency to think of the cosmos on its own terms. This new way of thinking often takes its course away from the confines of traditional, theocentric thought. Yet, on the other hand, many of these thinkers reformulated and reconceived God, the gods, and divinity. There is also a push towards ethics and thinking about human affairs and the best sorts of ways for human beings to live. Behind it all—the backdrop, as it were—is a preference for free, rational thought.

12. References and Further Reading

The lists of primary and secondary sources are very abbreviated. The secondary sources are generally accessible for non-specialists, and a good starting place for further research into the Presocratics. Some of these books also have extensive lists of references for further reading.

a. Primary Sources

Diels, Hermann and Walther Kranz. Die Fragmente der Vorsokratiker: Griechisch und Deutsch. Berlin: Weidmannsche Buchhandlung, 1910. Print.
- This is the first and most traditionally used collection of Presocratic fragments and testimonies. This edition has the fragments in Greek with German translations. The book is no longer in print, and while it is often still cited in most scholarship, it is not the work cited in this article.
Graham, Daniel W. The Texts of Early Greek Philosophy: The Complete Fragments and Selected Testimonies of the Major Presocratics. 2 vols. Cambridge: Cambridge University Press, 2010.
- This is the first collection of the Presocratic fragments and testimonies published with the original Greek and its corresponding English translations. It is the work cited in this text. Graham offers a short commentary on the fragments, as well as references for further reading for each thinker. He has organized by topic the fragments for each thinker, and labels the fragments with an F, followed by the number of the fragment. That is how the fragments have been cited in this article. Testimonies, as well as Graham’s commentary, are cited by page numbers.

b. Secondary Sources

Barnes, Jonathan. The Presocratic Philosophers. London and New York: Routledge, 1982.
- A classic work with interpretations of the Presocratics.
Burnet, John. Early Greek Philosophy. London: A&C. Black Ltd., 1930.
- Another classic work with interpretations of the Presocratics.
Long, A.A. ed. The Cambridge Companion to Early Greek Philosophy. Cambridge: Cambridge University Press, 1999.
- A collection of sixteen essays by some of the foremost scholars on Presocratic thought. The essays are generally accessible, but some are more appropriate for specialists in the field.
McKirahan, Richard D. Philosophy Before Socrates: An Introduction with Texts and Commentaries. Indianapolis: Hackett, 1994.
- This is a very good book for non-specialists and specialists alike interested in further commentary on the Presocratics. The book contains most fragments for most thinkers and reasonable explanations and interpretations of each. There is also a helpful chapter at the end of the book on the nomos-phusis debate. The text includes a fairly extensive section for suggestions for further reading.
Vlastos, Gregory. “Ethics and Physics in Democritus.” Philosophical Review, vol. 54, 578-592, 1945.
- This article is technical, but offers insight into the connection between Democritean physics and ethics, and was cited in the current overview.

Author Information

Jacob Graham
Email: jgraham@bridgewater.edu
Bridgewater College
U. S. A.

Armed Humanitarian Intervention

Humanitarian intervention is a use of military force to address extraordinary suffering of people, such as genocide or similar large-scale violations of basic human rights, where people’s suffering results from their own government’s actions or failures to act. These interventions are also called “armed interventions,” “armed humanitarian interventions,” or “humanitarian wars.” They are interventions to protect, defend, or rescue other people from gross abuse attributable to their own government. The armed intervention is conducted without the consent of the offending nation. Those intervening militarily are one or more states, or international organizations.

The need to consider and understand the many issues involved in humanitarian interventions has been brought home by the fact that these interventions have become more complex and more common since the 1980s, and by the consequences of non-intervention, such as in the Rwandan genocide of 1994, in which nearly one million people were killed in less than three months. Humanitarian interventions raise many complex, inter-related issues of international law, international relations, political philosophy, and ethics.

This article considers moral issues of whether or when humanitarian intervention is justified, using just war theory as a framework. Section One addresses general characterizations of humanitarian interventions and commonly discussed cases, as well as some definitional or terminological issues. Section Two examines the question: What humanitarian emergencies rise to a level at which intervention is appropriate? Section Three presents just war theory as a common framework for justifying humanitarian interventions. Section Four considers some other, related issues that may support or challenge armed interventions: international law, state sovereignty, the selectivity problem, political realism, post-colonialist and feminist critiques, and pacifism.

What is a Humanitarian Intervention?
The Threshold Condition for Intervention
Justifying Intervention: Just War Theory
Other Issues and Challenges
References and Further Reading

1. What is a Humanitarian Intervention?

The term ‘humanitarian intervention’ came into common use during the 1990s to describe the use of military force by states or international organizations in response to genocides, “ethnic cleansing,” and other horrors suffered by peoples at the hands of their own governments. But cases of armed interventions are not new. Several times during the nineteenth century European powers intervened militarily in various provinces of the Ottoman Empire to protect Christian enclaves from massacre or oppression (Bass). Following World War II there were many military interventions sometimes dubiously described as ‘humanitarian’, including by the United States in Latin America and France’s 1979 use of military force in its former colony, the Central African Republic. Other cases remain notable foci of scholarly discussion: India’s 1971 military intervention in East Pakistan, now Bangladesh; Vietnam’s 1979 intervention into Cambodia; and in the same year, Tanzania’s intervention into Uganda. Later cases include uses of military force to protect Iraqi Kurds, and interventions in Somalia, Haiti, Liberia, and Sierra Leone, among many others. The 1994 genocide in Rwanda focused attention on the consequences of failing to intervene, because external military force was not deployed to prevent the killing of nearly 1 million people in just three months of violence.

Philosophic attention to humanitarian interventions is not new. The seventeenth century jurist, Hugo Grotius, is credited with originating the modern conception of armed humanitarian intervention. In his classic work of 1625, The Law of War and Peace, he includes an entire chapter, “On Undertaking War on Behalf of Others,” and writes:

If a tyrant … practices atrocities towards his subjects, which no just man can approve, the right of human social connection is not cut off in such a case …. It would not follow that others may not take up arms for them.

Some argue that the earlier “just war” tradition’s appeals to natural law, in effect, permitted humanitarian interventions. Classic theorists like St. Augustine, Thomas Aquinas, and Vitoria saw a just war as aimed at the justice of punishing wrongdoing by other political leaders, which, some argue, would permit intervening against governments’ mistreating their own people (Johnson). In the nineteenth century John Stuart Mill appealed to the importance of communal self-determination in providing consequentialist arguments limiting armed interventions. In the late 20^th century, Michael Walzer entertained armed interventions as justified responses to acts “that shock the moral conscience of mankind” (Just and Unjust Wars, 107).

A humanitarian intervention is a form of foreign interventionism using military force. Consider this paradigm characterization of humanitarian interventions as:

the threat or use of force across state borders by a state (or group of states) aimed at preventing or ending widespread and grave violations of the fundamental human rights of individuals other than its own citizens, without the permission of the state within whose territory force is applied. (Holzgrefe, 18)

Humanitarian interventions are distinguished from other forms of interfering with another state’s activities, such as humanitarian aid, sanctions of various kinds, altering of diplomatic relationships, monitoring arms treaties, elections, or human rights practices, and peace-keeping. A humanitarian intervention does not require the consent of the target state: it is a form of coercion. The government is deemed culpable in the suffering of others that is to be prevented or ended. Those suffering and the target of the rescue effort are not nationals of the intervening states: humanitarian interventions are, as Nicholas Wheeler puts it, about “saving strangers.” Definitions are typically neutral as to whether the intervention is unilateral or multi-lateral and as to whether it is authorized (for example, by the United Nations) or unauthorized. Finally, the interveners’ purpose is rescue, defense, or protection of those who are suffering due to their own government’s actions or failures. The purpose is not conquest, territorial control, support of insurrectionist or secessionist movements, regime change, or constitutional change of government.

Humanitarian interventions vary in terms of the motivations of a state in using military force. Some stricter definitions require a purity or primacy of intention in the use of armed force: militarily addressing the suffering of others for reasons of national interest is then, by definition, not a humanitarian intervention. Other definitions attend more to the effects of intervening than to motivations. These definitional disputes involve evaluating actions on behalf of others. The issue, then, may be more a matter of how much normative work is to be done by definition rather than by a separable ethical judgment of the actions themselves. A deontologist like Kant or Aquinas, for example, might maintain that genuine instances of a morally worthy act require a purely humanitarian intention, while a utilitarian like Mill might insist that the motive matters not at all to what the act is or to the act’s morality, but only to our judgment of the actor. Such definitional issues also intersect with doctrines of political realism as explanation of states’ behavior (In IEP, see “Political Realism” and “Interventionism” (sec 3b).). If all state action is explained by national self-interest, typically understood in terms of national security, military or economic power, or material well-being, then all states’ actions are necessarily motivated by self-interest, actions motivated solely or primarily by humanitarian considerations are precluded, and there are, by these stricter definitions, no genuine examples of humanitarian interventions (see IV.d below).

Another terminological consideration is reflected in the work of the 2001 International Commission on Intervention and State Sovereignty (ICISS), The Responsibility to Protect. The Report title is a preferred term because it avoids militarizing what is a humanitarian action and it avoids the connoted approval of military action by labeling it ‘humanitarian’ (sec. 1.39-1.41). Indeed, these semantic concerns are grounded ultimately in re-conceiving state sovereignty not as a right not to be transgressed by outsiders, but as a duty to protect the people of a state and, if needed, people of other states (sec. 1.35). Many differences of definition about what constitutes a humanitarian intervention reflect varying views about the normative merits and justifications for using force to address the suffering of others at the hands of their own government.

2. The Threshold Condition for Intervention

Even proponents of humanitarian intervention advocate very limited circumstances where such uses of military force are justifiable. In particular, proponents attempt to specify minimum, threshold conditions in terms of the severity, scale, and kinds of human suffering necessary (but not sufficient) to justify intervention. For example, seeing interventions as rescues, Michael Walzer specifies situations of “massive violations of human rights” where “what is at stake is the bare survival or the minimal liberty” of a people (Just and Unjust Wars, 101). The ICISS Report, The Responsibility to Protect, specifies a threshold condition in terms of “large scale loss of life, with genocidal intent or not,” or “large scale ‘ethnic cleansing’, … whether carried out by killing, forced expulsion, acts of terror or rape” (sec 4.19). In a similar vein, Nicholas Wheeler speaks of supreme humanitarian emergencies, where there are “extraordinary acts of killing and brutality” beyond the “abuse of human rights that tragically occurs on a daily basis” and that are of a magnitude and severity that “the only hope of saving lives depends on outsiders coming to the rescue” (34). Common among specifications of threshold conditions are requirements that the most basic of human rights are being violated, that the human suffering is widespread and systematic, and that the government bears some culpability for what is happening to its people. Interventions, then, are justifiable only to address the most egregious violations; the threshold conditions in the target state must be those that, as Walzer put it, “shock the conscience of mankind.”

Specifications of threshold conditions raise several issues. A specification of the conditions of suffering will be inherently vague. How many rights violations or how many horrors or how extraordinary must the violations be in order to satisfy the threshold condition for armed intervention? A second issue involves the invoked notion of basic human rights. It is commonly held in Western ethical literature that all human rights are equally important (See “Human Rights” in IEP). Attention to violations of basic human rights, however, presupposes a hierarchy of such rights for all humans. Allen Buchanan, for example, includes significant civil and political rights as well as “the right to resources for subsistence” in a list of basic human rights (Justice, 129); in The Law of Peoples, John Rawls identifies “a special class of urgent rights” that includes ethnic groups’ right to “security from mass murder and genocide” (79). Some have argued that negative rights (for example, not to be tortured, not to be raped, not to be killed) are more basic and more important than positive rights (for example, to basic welfare requirements of food, clothing, shelter), though, following Henry Shue’s analysis in Basic Rights (Princeton, 1980), not all accept such a distinction or hierarchy. There are disagreements about the extent to which basic human rights are the rights of individuals or may include rights of collectives, or “group rights,” such as rights to collective self-determination, group survival, and cultural integrity. International human rights law introduces yet other hierarchies which may be relevant. 
By international treaty the right not to be tortured is uniquely absolute; only a few human rights are not to be derogated even if the nation’s survival is at stake, while during declared public emergencies a state can set aside other human rights (International Covenant on Civil and Political Rights, Article 4.1-4.2); and the legal obligations of states to respect civil and political rights are much stronger than for social, economic, or cultural rights (cf. International Covenant on Civil and Political Rights, Article 2, and International Covenant on Economic, Social and Cultural Rights, Article 2).

Government culpability must satisfy certain threshold conditions. Human suffering and rights violations, even when widespread and systematic, may be perpetrated by the government itself, as, for example, during the Holocaust in Nazi Germany; the late 1970s “killing fields” of Cambodia’s Khmer Rouge regime; or 1990s “ethnic cleansings” conducted by the Bosnian Serbs in the former Yugoslavia. Or government may be complicit, indirectly fostering human rights violations by providing funding, arms, or logistical support to private militias, by coordinating attacks on people through control of the communication infrastructure, or by inciting action through propaganda and other forms of media control. This was the case in the Rwandan genocide of 1994 and during the violent campaigns by the janjaweed in Darfur, Sudan, beginning about 2004. Or state involvement may be more akin to negligence, incompetence, or inability to govern. In inept or failed states, the government does not maintain effective control of territory and people. This often leads to widespread violations of human rights by non-state actors. Somalia was a “failed state” in 1990. In much of the country, people lived in fear of armed militias while the central government could not assert effective control. The United States intervened militarily. The government culpability necessary to satisfy threshold conditions can range from perpetrator to failed state. Furthermore, situations often mask or complicate whether threshold conditions are satisfied. Widespread and systemic human suffering often occurs amidst or accompanying domestic insurrections, counter-insurgency campaigns, revolutions, liberation efforts, partition or secession battles, or civil wars, for example. 
In Darfur, the government of the Sudan claims to have been conducting a counter-insurgency campaign; the Bosnian War can be seen as part of a secession or partition battle; the Rwandan genocide occurred in the context of a civil war and struggle for power in the country. The challenges here are both epistemic and conceptual. Satisfying threshold conditions of suffering may depend on the specific domestic contexts in which people and government find themselves.

Though most discussions of humanitarian interventions specify threshold conditions in terms of human rights violations, other kinds of characterizations of the relevant human suffering are used by others. There are justificatory implications for these kinds of differences. A characterization in terms of human rights readily suggests deontological justifications for armed interventions. For any genuine right, others are bound by correlative duties. Even if the primary correlative duty for human rights falls on a national government, some argue that the correlative duties of others include obligations to respect, protect, and enforce human rights whenever the primary duty-holder fails to do so (these are sometimes called “default duties”). Thus, armed interventions are justified partially as discharging (default) duties correlated with the human rights that are being violated in the target state. On the other hand, some see armed interventions as aimed at reducing human suffering, regardless of whether there are violations of specific human rights. As mentioned above, the ICISS Report specifies “large scale loss of life” or “large scale ‘ethnic cleansing’.” Such a characterization of threshold conditions suggests a direct consequentialism at work. Some feminists have argued that social oppression of women constitutes threshold conditions for forceful interventions (Cudd). Uses of armed force, of course, have costs for human suffering, too. The idea is that sometimes the use of deadly force is justifiable to save lives and reduce total human suffering.

Another development relevant to interventions is the concept of human security. The notion of multi-faceted security, then, applies not only to states (as in “national security”), but also to people. The concept of human security is defined broadly both in terms of causes and kinds of human suffering. For example, the ICISS Report, The Responsibility to Protect, describes the security of people as

their physical safety, their economic and social well-being, respect for their dignity and worth as human beings, and the protection of their human rights and fundamental freedoms. (sec. 2.21)

Furthermore, valuing human security requires addressing “threats to life, health, livelihood, personal safety and human dignity” without regard to the sources of these threats, whether governmental, man-made, or natural. The United Nations Development Program has adopted a similarly broad definition. With respect to threshold conditions for humanitarian interventions, using the broad concept of human security has some advantages. It eliminates the need to establish target state culpability for its peoples’ suffering; and, in fact, often some government action or inaction at least partially explains even famines, and the effects of droughts or earthquakes. Determining whether threshold conditions are satisfied is also simpler without the need to apply specific legal or moral categories such as basic human rights or genocide. Furthermore, it is argued, the concept calls attention to preventing humanitarian emergencies from emerging, instead of focusing so much on armed interventions as reactions to emergencies. But the breadth and scope of the concept also make it challenging to use as a threshold condition for humanitarian interventions. Virtually any kind of widespread, systematic suffering or threat to people becomes a security issue possibly addressed by an armed intervention: many situations around the world thereby satisfy a requisite condition for justifiable intervention. The concept’s breadth erases or postpones justifying priorities, both by states trying to address their own peoples’ needs and by states or organizations readying to rescue those not secure under their own governments. As a threshold condition, the breadth of the concept of human security only makes more common the problems with properly selecting which of many humanitarian emergencies warrant others’ use of armed force to alleviate human suffering (see IV. C. below). For these and other reasons, the concept of human security is not often invoked in articulating threshold conditions for interventions.

3. Justifying Intervention: Just War Theory

The satisfaction of specified threshold conditions and state culpability requirements are only necessary conditions for morally justifying humanitarian interventions. There is a paradoxical quality in using deadly force to prevent or end violence against others. How can it be that war is warranted in the name of saving lives? A common response employs the “domestic analogy,” seeing states as analogous to persons. As a matter of morality and legality, individuals have rights of defense that permit using deadly force as a proportionate response to unavoidable, imminent threats to their own lives or to the lives of others, whether the endangered people are kin, akin, or strangers. By analogy, then, states have not only rights of self-defense if attacked, but rights to use deadly force in defense of others. A second analogy also sees states as persons. Given that individuals, in some circumstances, ought to perform beneficent acts such as “interposing to protect the defenseless against ill usage” and “saving a fellow-creature’s life,” to use John Stuart Mill’s phrasing, so states sometimes are right to rescue others being poorly treated under their own government. More direct arguments see a connection between taking universal human rights seriously and acting rightly with deadly force when this force is necessary to defend or protect those rights. Direct consequentialist arguments appeal to the morality of preventing extraordinary suffering when possible, that is, if and when there is opportunity and capability that is not more costly in its effects on human lives than not acting with deadly force. Thus, there need not be inconsistency or paradox in saving lives by using armed force, at least in some grave circumstances.

Discussions of whether humanitarian interventions are justified take seriously both the moral pull of extreme humanitarian emergencies that “shock the conscience of mankind” and a moral reticence about using deadly force even to save lives. Regardless of the kind of moral theory employed – direct or indirect utilitarianism, natural law principles, or correlative duties of human rights, for example – justifying an armed intervention involves addressing a host of questions: Who or what has the authority to intervene? Is an intervention likely to succeed, or be worth the costs, on balance? Are there not non-military measures available to address the human suffering? What exactly is the purpose of the military action and how are armed forces to conduct themselves in defending, protecting, or rescuing others from their own government? Such questions are, in fact, paralleled in the structure of just war theory, or jus bellum, and its traditional duality: jus ad bellum, or the conditions requisite for justifiably going to war, and jus in bello, the principles governing proper conduct of war. Just war theory—especially jus ad bellum—is the framework for making moral decisions about humanitarian interventions. For example, in Saving Strangers, Nicholas Wheeler says “requirements that an intervention must meet… are derived from the Just War tradition” (33-34). The 2001 ICISS Report, The Responsibility to Protect, summarizes “criteria for military intervention… under the following six headings: right authority, just cause, right intention, last resort, proportional means and reasonable prospects” (sec. 4.15-4.16ff.). Michael Walzer defends interventions in his classic work, Just and Unjust Wars, and again prominently in the Preface to the Third Edition of that book. Many critics challenge the suitability, adaptations, and implications of just war theory for humanitarian interventions. 
So, proponents, opponents, and cautionary discussants employ just war theory in exploring the moral merits of humanitarian interventions.

There are additional reasons for relying on the jus bellum framework. First, humanitarian interventions resemble wars, and are even sometimes referred to as “humanitarian wars.” Military force is used in another nation’s territory in order to rescue, protect, or defend people. The most basic moral question of modern just war theory is delineating what states are permitted to do through the use of military force to those outside their borders and for achieving what aims or purposes. Second, the classic just war tradition includes attention to what are now called humanitarian interventions, at least as far as the cause and purpose of such military action. Morally justifying humanitarian interventions, then, is often explored by interpreting, applying or adapting the standards for judging whether going to war is justified; receiving the most attention are issues of just cause and right authority for interventions. Other major facets of just war theory and its tradition – jus in bello, and jus post bellum – are also employed, though less prominently, as there has been much less philosophic attention to the conduct of interventions or what follows the use of armed force to rescue, protect, or defend others.

a. Justifying the Recourse to War (jus ad bellum) and Interventions

The jus ad bellum framework of just war theory identifies about a half dozen considerations relevant to justifying the recourse to war. All the ad bellum requirements must be satisfied for war to be justified. So, the use of armed force for humanitarian purposes is justified only if all six ad bellum requirements are satisfied. Three of these considerations – last resort, likelihood of success, and proportionality – are consequentialist requirements. Proportionality, for example, requires that the benefits of military action are not overshadowed by the inevitable costs, destruction, and other negative effects. Likelihood of success involves estimating the consequences of waging war, specifically, the probability that the war’s aims will be accomplished. Last resort captures the idea that war is worth its effects only if non-military means are not available for success: recourse to war is justifiable only if alternative, pacific courses of action will not achieve the morally acceptable aims of war (tied to a “just cause”). The other three jus ad bellum considerations – just cause, right authority, right intention – appear to be deontological, rooted in natural law, for example, human rights, or other normative, non-consequentialist principles. Pivotal among jus ad bellum considerations is the notion of “just cause” for war. Adapting the jus bellum framework to humanitarian interventions brings a mixture of deontological and consequentialist reasoning to the issues, with the satisfactions of threshold conditions – a “just cause” – being central to justifying the use of armed force to address human suffering.

i. Just Cause

In the just war tradition, just cause has long been among the basic considerations in determining whether the recourse to military force is justified. St. Thomas Aquinas’s famous first articulation prominently includes “just cause” as a requirement, as do virtually all subsequent contributors to the tradition. The idea is that certain circumstances rightly prompt and contribute significantly to a justification for a war. Furthermore, the just war tradition, just war theory, and international law today acknowledge that armed attack by another justifies going to war: wars of reactive self-defense clearly satisfy the “just cause” requirement. As applied to humanitarian interventions, then, the issue is whether a “just cause” includes defense of others, or as many state it, whether threshold conditions for intervention are a “just cause” for a state or states using armed force to rescue, protect, or defend other people.

Supporters of justifiable interventions call attention to features of the just war tradition. As noted in sec. I, Hugo Grotius explicitly acknowledges that a government’s subjects suffering atrocities permits others to “take up arms for them.” A dominant theme of the classic just war tradition is that punishable wrongs are a just cause for war, even if the intervening party has not been wronged. James Turner Johnson, for example, suggests that traditional just war theory is not based on a presumption against war, but on a presumption against injustice: a just war is not only a justified war, it is a war waged for justice. The interpretative contention, then, is that only in the early 21^st century has just war thinking come to be so restrictive about “just cause” as to allow only for wars of self-defense.

Aside from interpretations of the just war tradition, a number of fundamental issues are at stake in debates about the substantive content of the “just cause” requirement as it pertains to humanitarian interventions. One matter deals with the kind of moral foundations presupposed for just war theory itself. Some appeal to transnational ethical norms about rights or duties, whether expressed as universal natural law principles about rights of defense or duties correlated with universal human rights. So, sometimes also coupled with the “domestic analogy” between persons and states described above, the ethical arguments turn on whether there is a natural law duty to rescue or render aid, a (default) duty to enforce human rights, or a transnational right to defend others conjoined to the uncontroversial self-defense right of states. Discussions often challenge the adequacy of the analogies: states seem much unlike individuals when it comes to ethical norms. Also, legal positivists, especially, find the appeal to natural law more than suspect, and positive international law is explicit only about states’ right to self-defense (see section IV.A. below). Others discuss the “just cause” requirement by invoking different conceptions of the world community. At one extreme, the world community is inter-national, a community of nations or sovereign states relating to one another by mutual agreement with one another; an opposing conception thinks of a global ethical order of trans-national norms about people (that is, human rights). In effect, some of the debate is couched in broad issues of how state-centered or people-centered the world ethical order is to be.

ii. Right Intention and Right Authority

Intention, or purpose, and authority have both been basic considerations in determining whether the recourse to military force is justified. Aquinas’s first articulation of just war theory includes “right intention” and “right authority” as requirements. Matters of intention, or purpose, have since not always been accorded independent status. For example, Grotius does not list intention as a separate requirement for justly going to war, and later versions of just war theory often seem to conflate “right intention” and “just cause.” In the application of just war theory to humanitarian interventions, however, the “right intention” requirement figures prominently in many discussions. As noted in sec. I, the issue emerges as a matter of definition, and some maintain purity of motive is essential to being a humanitarian intervention. Others note that the classic just war tradition mostly excludes certain aims or purposes for going to war – “not out of greed or cruelty, but for the sake of peace, to restrain the evildoers and assist the good,” writes Aquinas. So, the classic “right intention” requirement, it is argued, allows for a plurality of motives for waging war so long as excluded are such aims as conquest, territory, control of natural resources, and vengeance. When applied to humanitarian interventions, then, use of military force satisfies the “right intention” requirement if, among a plurality of motives, a primary purpose is addressing the widespread and systematic human suffering (Wheeler, 37-40).

The classic just war tradition emphasizes issues about the locus of authority to deploy military force. Modern just war theory typically presumes that states are the proper “right authority.” The advent of non-state war-makers – terrorist organizations, liberation movements, insurgencies, and insurrections, for example – raises interesting questions for this jus ad bellum issue. In applying the just war theory framework to humanitarian interventions, however, the “right authority” requirement is prominently discussed, often in the context of international law and institutions. Under the United Nations Charter, except for wars of reactive self-defense, a state is explicitly permitted to employ military force against another only with Security Council authorization to “preserve international peace and security” (see IV.A. below). For this and other reasons much discussion is devoted to whether interventions are justifiable when unauthorized by the United Nations (and thus, illegal).

There also are some ethical dimensions to “right authority” questions about humanitarian interventions. In as much as impartiality is an ethical norm, there may be a strong presumption for only centrally authorized or multi-lateral interventions being justifiable. In as much as speed of response to a supreme humanitarian emergency saves more lives and a single state can be more decisive, there is support for permitting unilateral and unauthorized interventions. In as much as there are moral objections to the current, restrictive international law of force, states’ or international organizations’ unauthorized interventions may have some moral merit as a way of reforming the law or as a justified cost of promoting basic justice or protecting basic human rights. In as much as the quality of the intervention is affected by characteristics of the intervening party – suitable military capability, quality command and control infrastructures, experience, a good human rights record – perhaps only certain states or organizations satisfy the “right authority” requirement for justifiable humanitarian interventions.

iii. Likelihood of Success, Last Resort, and Proportionality

Three additional jus ad bellum requirements also must be satisfied for a war to be justified. As applied to justifying an armed intervention, then, using military force to address a humanitarian emergency must be likely to succeed. As in just war theory, the principle of likely success presupposes a sufficiently clear understanding of “just cause” and “right intention.” For humanitarian interventions, then, success is at least preventing or stopping the widespread violence and suffering that constitutes “just cause” and defines the purpose of the incursion. If such success is not likely, then the intervention is not justified. Aside from the inherent vagueness of the standard, estimating the likelihood of a successful intervention is complicated, a function of at least two general factors (among others): the military capabilities and effectiveness of the intervener, and the capabilities of the target state or other forces involved in the violence that constitutes “just cause.” The latter can be further complicated by the need to estimate secondary effects: for example, whether an armed intervention may provoke military mobilization and responses by the target state’s allies, with looming possibilities of a larger conflict. As even proponents of intervention admit, some interventions will not succeed and “some human beings simply cannot be rescued except at an unacceptable cost” (International Commission, sec. 4.41). It also follows that inequality of military power among states is normatively significant. A humanitarian intervention is not likely to succeed against large, powerful states, like China or Russia, while success is more likely for emergencies occurring in smaller, weaker nations; furthermore, large, militarily powerful states are more likely to be successful interveners than smaller, weaker nations or organizations.
Thus, inequalities of states’ military power create inequalities of immunity and vulnerability to justifiable armed interventions; and power differentials create inequalities of moral right, responsibility, or duty to intervene in response to human suffering around the world.

The “last resort” requirement expresses the general idea that war is worth its deadly, destructive effects only if every non-military alternative will not work to achieve the same ends (that is, what counts as success, which is linked to “just cause” and “right intention”). Though the general idea is more than plausible, specifying the “last resort” requirement precisely is controversial because one can almost always argue that not all non-military avenues have failed: more diplomacy or negotiation almost always seems possible. Indeed, on a literal construal of the requirement, no war ever satisfies this jus ad bellum requirement. On the other hand, in law and morality, reactive wars of self-defense are justified, even though non-military means of resolution, in fact, are not attempted after an armed attack occurs. So, justifying a war as a last resort depends on at least two features of the specific situation: time, or urgency of action, and likelihood that non-military measures would succeed.

Supreme humanitarian emergencies often exhibit urgency akin to that of a state facing a surprise attack or invasion: if lives are to be saved and people to be rescued, there is not time for peaceful pressure, coercion, diplomacy, negotiation, sanctions, or boycotts to work effectively. The Rwandan genocide of 1994 vividly illustrated this sort of urgency. Aside from temporal considerations, intervening as a last resort involves assessing the likelihood of any non-military means being effective, but not necessarily actually implementing or trying all those means. This leads to typically counterfactual formulations of the “last resort” requirement, as illustrated by the ICISS Report, The Responsibility to Protect.

[Last Resort] … does not necessarily mean that every [non-military] option must literally have been tried and failed…. But it does mean that there must be reasonable grounds for believing that … if the measure had been attempted it would not have succeeded. (sec. 4.37)

This way of proceeding points to a second temporal feature of war as last resort. Some opponents of intervention bemoan the lack of infrastructure, for example, that would enrich and support effective, non-military means of defending and protecting basic human rights. The idea is that more could have been done to prevent horrors and to be ready to react non-militarily when emergencies do emerge. An issue here, then, is the time framework for constructing possible means of addressing the emergency. A circumscribed last resort principle requires assessing the effectiveness of those means available at the time of the emergency, however rich or limited they may be. A broader last resort principle seems to deny armed force is a last resort if more could have been done in the past to enrich the availability of effective non-military means today. This broader construal, however, seems to conflate a call to build better prevention mechanisms with assessing military and non-military options available when supreme humanitarian emergencies actually occur and decisions have to be made.

The jus ad bellum proportionality requirement is often labeled “(macro-)proportionality,” to distinguish it from the in bello proportionality, or (micro-)proportionality principle. The ad bellum principle addresses the general concern that the deaths, destruction, and other negative effects of war must be balanced by its benefits (that is, success). In considering war’s effects the proportionality principle precludes excessive partiality. So, a war’s effects on everyone are to be counted – civilians and combatants, whether friend, foe, or neutral. All death, injury, and destruction are to be considered, and relevant effects must not be limited to one’s own national interest and do include the international community. This breadth of considerations brings to the fore difficult matters of the commensurability of values and, as for any consequentialist argument, epistemic challenges related to the causal impacts of action. Yet some rough estimates of wars’ costs and benefits can be and have been plausibly made. For example, a few thousand armed soldiers quickly deployed to Rwanda in April 1994 would likely have saved many, many lives, whereas militarily stopping the suffering of Chechens or Tibetans would very likely bring exorbitant costs in death and destruction. In other cases, while the benefits of an armed intervention include rescues of suffering peoples, a cost might be significant erosion of the stability and order of the system of states. The idea is that vigilante justice by state militaries has costs to the international system’s order, stability, and peace, costs that are not balanced adequately by the reduced suffering of people in a particular nation or region. Michael Walzer, like other just war theorists, concludes that the proportionality requirement “… is a gross truth, and while it will do some work in [some] cases …, it isn’t going to make for useful discriminations in the greater number of cases” (Arguing 90).
Contributing to the challenges for the proportionality requirement are controversies about its structure: for example, whether an adequate macro-proportionality principle requires minimizing the bad effects, maximizing the net benefits, or merely ensuring a favorable balance of benefits over bad effects. Applied to justifying armed interventions, then, the macro-proportionality requirement speaks to a central concern, but cannot reliably discriminate finely among the many humanitarian emergencies that arise.

b. Justifying Conduct in War (jus in bello) and Justice after War (jus post bellum)

The general idea of proportionality is one that links the traditional division of just war theory into jus ad bellum and jus in bello principles. The latter just war requirements govern how a war is to be conducted: proportional means are to be used, and non-combatants are to be distinguished from combatants in waging war. Given the just war theory framework for justifying humanitarian interventions, these in bello considerations are relevant and applicable to uses of military force to address humanitarian emergencies. If an armed intervention is to be a (fully) just war, then, the rules of engagement (ROEs) need to reflect both in bello principles. These principles raise many issues for just war theory and some challenging ones for the morality of interventions.

The in bello micro-proportionality requirement governs military operations during a war. The general idea is to minimize the armed force used, and destruction caused, in order to attain a militarily necessary objective. But unlike the ad bellum macro-proportionality requirement, in assessing the effects of a military operation, it matters much who benefits or suffers. First, combatants and non-combatants are to be distinguished. This is the in bello principle of discrimination. As Michael Walzer expresses it, the general idea is that wars are waged between combatants: non-combatants “are not currently engaged in the business of war” and thereby are “outside the permissible range of warfare” and carry an “immunity from attack” very much unlike combatants. Though being more specific about the distinguishing criteria or about the permissibility of some non-combatant casualties (that is, as “collateral damage”) is controversial and complicated (see much of Walzer’s Just and Unjust Wars, for example), in estimating the consequences of a military operation in war, one is to count much more any ill effects on non-combatants: who suffers matters much. Second, in war and interventions it is permissible to count more the costs to one’s own forces than losses to opposing combatants. The notion of “force protection” becomes morally acceptable, at least within some limits: the conduct of the war is justifiable even if operations distribute risks more to opponents’ forces than to one’s own forces, provided there are “no more casualties than necessary inflicted on the other side.” But third, those force protection limits for intervening forces can become problematic. As illustrated by the Kosovo intervention of the 1990s, high altitude bombing effectively reduces the interveners’ losses while also increasing the costs to non-combatants on the ground.
How much “collateral damage” to non-combatants (and non-military property) is morally permissible in order to reduce risks to one’s own forces? How much death and damage to opponents’ military force is not excessive and is morally permissible in order to achieve humanitarian ends?

The traditional in bello requirements of just war theory lead to challenges for this approach to the morality of humanitarian interventions. George Lucas, for example, argues that “the use of military force in humanitarian cases is far closer to the use of force in domestic law enforcement” than it is to waging war. Interveners are there to protect and defend, akin to the mission of a police force. Seeing this more constabulary role of intervening forces entails that “international military ‘police-like’ forces (like actual police forces) must incur considerable additional risk, even from suspected guilty parties,” while, like domestic police forces, “refrain from excessive collateral damage, … the deliberate targeting of non-combatants, … [and] engaging in violation of the law.” These are “far more stringent restrictions in certain respects than traditional jus in bello” requirements. Indeed, these stringent restrictions apply even if interventions are seen as “saving strangers” and the mission seen as a rescue. Thus, Lucas concludes, “the attempt to assimilate or subsume humanitarian uses of military force under traditional just war criteria fails.” Interventions “are sufficiently unique as to demand their own form of justification, … jus in pace, or jus in intervention,” and specific, substantive requirements for interventions are proposed in a structure parallel to the traditional just war framework of jus ad bellum and jus in bello principles.

A third major facet of just war theory, jus post bellum – literally “after war” – is also sometimes a framework for examining the morality of humanitarian interventions. The 18^th century work of Immanuel Kant, in Perpetual Peace and elsewhere, is often credited with originating the notion of jus post bellum, though Vitoria and Suarez both earlier distinguish this facet of just war theory. The roots of the notion are embedded in the classic “right intention” requirement that the aim of justifiable war is to be peace. How one ends a war – even a just one – affects whether peace will follow, for how long, and the structure of the peace that will or should be. For example, is a just end of war establishment of the status quo ante bellum, which is perhaps plausible for wars of self-defense against an invasion? Or ought a war end by establishing “peace with justice?” And what might such justice require – unconditional surrender, reparations, repatriations, disarmament, punishment of perpetrators, structural adjustments in the distributions of land or wealth, establishment of democracy, restorations of relationships? In international law and among just war theorists, this third major component of just war theory has received comparatively little attention. (One important exception is the work of Brian Orend.) Jus post bellum issues are important to the morality of humanitarian interventions. As C.A.J. Coady cautions, considering interventions requires specifying not only from what it is people are to be rescued, but also for what it is that they are rescued (in Chatterjee & Scheid, 291).

Jus post bellum considerations lead to tensions and challenges for thinking about the purposes and morality of humanitarian interventions. For example, given the nature of supreme humanitarian emergencies, stopping the violence leaves a great need for extremely difficult reconciliation processes, a facet of rebuilding a functioning social order. In addition, to prevent recurring violence the root causes may need to be identified and addressed, which likely involves major changes in a society’s basic structure and institutions. Perhaps justice requires some punitive action towards perpetrators and accomplices, whether heads of state, government officials, or local militia leaders and private citizens. Arrests, war crimes trials, truth commissions, and the like may be warranted for what is sometimes called “transitional justice.” A concern is that seeking retributive justice can counter needs for reconciliation and matters of restorative justice, as can redistributions of wealth, land or political power to address root causes of the violence (Govier, Orend). The ICISS Report, The Responsibility to Protect, identifies such issues as elements of “the responsibility to rebuild.” A long-term aim of genuine peace, then, generates complex questions of how such peace relates to other important aims, such as justice.

These kinds of post bellum considerations effectively broaden and perhaps challenge just war thinking about the morality of humanitarian interventions. First, if the ad bellum success requirement is more than rescuing, defending, or protecting victims, but also includes justice (retributive, distributive, restorative) or rebuilding, then the challenges of success are much greater, the likelihood of success is much less, and the capabilities for success (including political will) are rarely available. It would follow that virtually no interventions are justifiable by just war standards: more often many more people will be beyond rescue than what follows from narrower understandings of what a successful intervention entails. Second, these broadening, post bellum considerations challenge the very conception of interventions as rescues, as “saving strangers.” They make interventions perhaps more akin to a police action, with attention to arrest and enforcement, or more like a mission to establish peace with justice, or more like a complex, long-term humanitarian aid program of which one significant dimension is the use of armed force. On the other hand, one can look at the responsibility to rebuild or seek justice post bellum as a distinct phase following the humanitarian intervention proper. A fully just use of military force – even seen as rescue, protection, or defense – may require that some organization or states address post bellum issues and rebuilding once the violence is ended, but those needs need not be addressed by the interveners themselves. Just war thinking then requires that interveners use military force with consideration of post bellum requirements, but the post-intervention missions need not be the action of the rescuers themselves. Some such distribution of responsibilities may be, for example, what is envisioned by the ICISS Report, The Responsibility to Protect.

c. Some Implications of Justifying Humanitarian Intervention

Just war theory, in its entirety, articulates appropriately high standards for morally judging war and for justifying humanitarian interventions. Even the ad bellum standards more frequently addressed are not all easily satisfied (a point sometimes insufficiently appreciated due to the excessive focus on threshold conditions, or “just cause”). There are some significant implications of the just war framework for assessing the moral justifications for humanitarian interventions. There are, for example, daunting epistemic issues in establishing that threshold conditions are satisfied or in assessing the complex consequences of an intervention. The latter is only aggravated by the near certainty of unintended consequences for military campaigns and by the frequent situation of estimating effects of using armed force in foreign lands and cultures. As already noted, given the inequalities of military capability among states that affect interveners’ likelihood of success, just war thinking leaves some nations much more vulnerable to interventions for mistreating their own people, while other states can violate human rights with impunity from others’ use of military force to stop the violence. The same inequalities of capability result in an unequal distribution of the right or responsibility to intervene militarily on humanitarian grounds, with all the attendant costs of such interventions.

There is a basic deontic category issue in exploring the moral merits of humanitarian interventions via just war theory. Is a justified war a matter of a right, responsibility, or duty? And what kind of right or duty is signaled by establishing that a war is justified? Parallel questions then apply to justified humanitarian interventions: are they a matter of a right to use armed force, or a responsibility or duty of some kind? What kind of right or duty, then? Addressing such questions from a just war framework intersects with varying conceptions or analogies employed in discussing interventions. For example, one might consider interventions in defense of others as a right associated with rights of self-defense. Associated with individuals’ right of self-defense is a right to use deadly force in defense of others. In parallel fashion, then, associated with states’ right to wage wars of self-defense would be a right to use military force in defense of others. Such a right of defense – of self or of others – is one that the right-holder chooses whether to exercise or not: just as there is no duty to fight in self-defense, then, there is no duty to use deadly force in defense of others. Humanitarian interventions, then, are a matter of moral right, not duty or obligation; and they are what are called liberty-rights or discretionary rights of intervention. In as much as jus ad bellum principles identify when there is such a right to wage war, then they can be used to identify when there is a moral right to intervene militarily for humanitarian purposes.

In contrast, armed interventions are often portrayed in ways suggesting there is a duty to use military force to address humanitarian emergencies. A common conception is the notion of interventions as rescuing others. The “rescue” metaphor suggests using military force is an imperfect duty of individual beneficence or charity, of rendering aid to strangers facing life-threatening situations, or what are sometimes called “Good Samaritan” duties. Such imperfect duties are not correlated with others having a right to be rescued, and wide discretion is accorded the obligated as to when, where, and how to discharge the duty. As we have seen, though, the “Good Samaritan” analogy may be a strained one, at best: states’ or international organizations’ interventions are not relevantly or sufficiently akin to “saving strangers” as if a tragic accident had befallen them. It has also been suggested that interveners are more akin to a police force, which suggests that justified interventions are discharging a duty to protect and defend others in grave danger. A humanitarian intervention, it would seem, is justified under conditions analogous to those for domestically dispatching S.W.A.T. teams. A challenge for either of these conceptions of interventions – as rescue or as constabulary – is a dissonance between a moral duty to use military force for humanitarian purposes and the kind of moral justification for waging war the jus ad bellum principles provide. Does just war theory establish a moral duty to wage war? If not, how can jus ad bellum principles ever support a duty to intervene militarily? To speak of a moral duty to wage war is today not obviously plausible. The notion of a duty to wage war may be consistent with some classic contributors to the just war tradition, such as Aquinas. Late 20^th century theorists, like Michael Walzer, argue that sometimes there is a duty literally to combat evil.
But the idea that just war theory establishes duties to wage some wars is controversial and defended by some for only quite unusual circumstances, of which, of course, a supreme humanitarian emergency may be one. As some have argued, jus ad bellum can establish at most the moral permissibility or right of intervening; additional considerations are needed to establish humanitarian interventions as morally obligatory.

Other proponents of humanitarian interventions argue for a duty to intervene based on taking human rights seriously, as a duty correlated with people’s basic human rights not to be tortured or killed arbitrarily. The general idea is that, as moral claim-rights held by all, human rights entail everyone has a duty to protect and promote the human rights of everyone else (See IEP article, “Human Rights,” sec 3). One form of the argument is that, as a matter of international law, practice, and practicality, these correlative obligations fall largely upon national governments and international organizations (for example, the United Nations). Others argue more forcefully that the logic of basic human rights establishes correlative duties to respect, protect, and defend. In Basic Rights (Princeton, 1980), Henry Shue famously argues that a basic right such as the right not to be killed arbitrarily entails not only duties not to kill, but duties to protect or enforce the right: a negative right such as the right not to be killed arbitrarily requires positive action by others (as do positive rights to subsistence). Furthermore, Shue argues later, if and when the primary holder of the correlative duties (that is, the state) fails to meet its obligations, then the duty to protect and defend human rights defaults to others. Thus, humanitarian interventions are justifiable as discharging a (default) duty to protect and defend basic human rights not being respected by the target state. At least in its conclusion, the ICISS Report, The Responsibility to Protect, advances a similar view about interventions. Another, related argument to support a duty to intervene derives from theories of global distributive justice. For example, Allen Buchanan argues that a natural duty of justice obligates all to do what we can “to help create structures that provide all persons with access to just institutions, … where this means primarily institutions that protect human rights” (Justice, 85ff.).
Arguments appealing to rights and correlativity relations are not uncontroversial. For those who distinguish positive and negative rights, for example, the correlative duty for a right to life is simply and only not to kill. Thus, so long as a state or international organization is not the perpetrator of atrocities against its own people, it would seem that correlative obligations have been satisfied without coming close to establishing military intervention as a duty. An issue in the background is what one takes to be the model for understanding human rights and the extent to which duties of respect and protection correlate with those rights and for whom or what.

4. Other Issues and Challenges

Just war theory has been the most prominent framework for philosophic discussion of the morality of humanitarian interventions. Other relevant approaches include attention to international law and its ethical implications and an issue central to political philosophy, the concept of state sovereignty. Among the most powerful and prominent objections to interventions are those based on state sovereignty and on what is called “the selectivity problem.” Some alternative frameworks for considering humanitarian interventions are actually challenges to just war theory itself. Political realisms deny the applicability of moral norms to state behavior, including uses of military force. Pacifism typically denies the premise of just war theory, namely, that some wars are morally justifiable, even if waged for humanitarian purposes.

a. International Law and Ethics

Much discussion of humanitarian interventions involves legal issues under the Charter of the United Nations, the central and paramount text of the international law of force. Philosophers of law have accorded relatively little attention to international law. Questions about the legality of interventions, however, exhibit significant philosophic and ethical dimensions, even setting aside here many matters of analytic jurisprudence and whether international law constitutes a genuine legal system, whether there is such a thing as international law at all (See IEP article, “Philosophy of Law”). Attending to the international law of force and human rights involves issues of interpretation, sources of law, ethics of acting illegally and reform, as well as the extent to which states or people ought to be at the center of the system.

At the center of the international law about interventions are explicit provisions of the United Nations Charter and human rights treaties. Proclaiming “the sovereign equality of all states,” the Charter permits states to use armed force only in self-defense, prohibits states’ “threat or use of force against the territorial integrity or political independence of any state,” prohibits intervening “in matters which are essentially within the domestic jurisdiction of any state,” and allows the Security Council to authorize uses of armed force only if domestic strife or brutalities also constitute “threats to international peace and security” (Articles 2.1, 51, 2.4, 2.7, and 39, respectively). The text seems unequivocally clear: unauthorized humanitarian interventions are illegal. And since humanitarian emergencies typically do not threaten international peace and security, the text permits authorizing few, if any, interventions. Furthermore, the nine core human rights treaties and the 1948 Universal Declaration of Human Rights explicitly require only that each state respect, protect, and enforce the provisions listed, such as rights to life. The 1948 Genocide Convention requires that signatories “prevent and punish” the “crime of genocide,” but the only explicitly permissible means is via “the competent organs of the United Nations.” The 1998 International Criminal Court statute (called the “Rome statute”) makes its authority largely dependent upon and only complementary to individual states’ enforcements. Human rights, for example, may be transnational norms, but international law makes the respect, defense, and protection of those rights almost exclusively a domestic matter for each state. So, to promote international peace and security, inter-state uses of armed force are severely limited by law, even when domestic violence against people may be widespread and systematic.

There are ethical dimensions to the system of international law as it relates to interventions. The United Nations Charter has been accepted by consent of each and every member of the United Nations – virtually every state on the planet. State consent is among the established procedures for creating international law, and consent creates compliance obligations for states. Legal positivists hold that there are (ethical) obligations to obey laws enacted according to established procedures (See IEP article, “Legal Positivism”). So, for positivists, it follows that states and international organizations ought to obey the law and therefore ought not conduct unauthorized humanitarian interventions. Though each state ought to respect, protect, and enforce human rights, relevant international law texts do not provide a legal basis for unauthorized or for authorizing interventions, even as a way to stop a state’s violating its own treaty obligations by mistreating its own people. Legal positivists maintain that there are legal and moral obligations not to interfere militarily with the domestic affairs of states, even in the face of a humanitarian emergency.

Challenges to this line of reasoning take several forms. First, some legal scholars quite carefully parse the specific Charter texts in ways consistent with humanitarian interventions being permitted. For example, Article 2(4) does not prohibit all uses of military force, but only those aimed at the independence or territory of another state. Legally permissible, then, would be any humanitarian interventions having neither those aims nor those effects. Disagreements about interpretation raise philosophic issues about how best or properly to interpret legal texts. Some dispute such textual parsing as ignoring the original intent of the language. Others deny that original intent is probative, granting a more significant role for contemporary attitudes, beliefs, and norms about interventions, or appealing to a political morality implicit in legal texts and their interpretive history. A second area of disagreement about legalities attends to considerations of non-textual sources of law, or what is called “customary international law.” Analogous to common law in domestic legal systems, general state practice accepted as law is evidence of a rule of customary international law (Article 38(1), Statute of the International Court of Justice). Some argue that long-standing state practice has established a customary right of humanitarian intervention; others deny this claim of fact about state practice, or assert that the written law of the Charter supersedes any putative customary rule. In effect, there is much controversy about what H. L. A. Hart famously has called “a rule of recognition” for the system of international law, especially customary law. A third and final issue deals with the ethics of acting illegally. At the heart of creating customary law about interventions is establishing a state practice of intervening.
This requires that states begin creating a custom by acting in ways neither required nor permitted by international law at the time: legality is created over time only by a process initially requiring illegal actions. Given sufficient moral grounds for reforms to permit humanitarian interventions, then, a moral argument can be made for illegally intervening now to address emergencies and thereby contribute to reforming international law. Unauthorized humanitarian interventions then can be seen as a kind of international civil disobedience by states or international organizations.

b. State Sovereignty and Intervention

State sovereignty is a major issue for humanitarian interventions, whether as a source of opposition or as a significant challenge for proponents. For centuries the general idea has been that a sovereign state has supreme authority over its territory, its people, and its relations with other states; and so, other states or organizations are not to interfere with exercises of that supreme authority. Matters of sovereignty have been central to political philosophy, international relations, international law, and the institutions and practices constitutive of the modern world order. Prima facie, humanitarian interventions challenge state sovereignty and the international system of non-interference in states’ domestic authority. The literature is vast, the issues complex, the notion of sovereignty contentious and controversial to the core.

The 16th-century French thinker, Jean Bodin, in his Six Books of the Commonwealth (1576), is credited with coining the term ‘sovereignty’ to denote a state’s supremacy of authority within a territory and population. Subsequent political philosophers, like Locke, Hobbes, Rousseau, and the utilitarians, have focused much on the source, locus, and limits of sovereignty within a state, while merely acknowledging an accompanying externally directed authority to make war, peace, alliances, and treaties with other powers. The idea of sovereignty as independence from others’ interference is tied originally to the 1648 Treaty (or Peace) of Westphalia and develops into a strong principle of non-interference during the 19th century. Simply stated, then, state sovereignty involves supremacy and independence of authority with respect to internal matters and with respect to relationships with other powers, including the absence of non-consensual interference by other sovereign states or other organizations. The “Westphalian system” is an order of mutually independent states excluded from interfering in one another’s domestic affairs.

Whether authority is seen as effective control or as a right, the merits of sovereignty as independence are mixed. For example, state sovereignty can express and protect a people’s collective right of self-determination in matters political, social, and cultural. A plurality of independent sovereign states accommodates the diversity among peoples of the earth; a system of non-interference promotes international stability and order. The sovereignty of states is sometimes portrayed as akin to that of individual persons, coupling autonomy rights over their own life, independence from external control, and mutual, reciprocal duties not to interfere with others. Also, state sovereignty is long embedded in international law. As mentioned above, the United Nations Charter, for example, asserts the equality, independence, and freedom from external interventions in states’ domestic affairs. In contrast, it is argued, taken too far, sovereignty precludes any international law at all, since supremacy and independence are reduced by any transnational legal rules limiting war or breach of treaties, for example. Similar reasoning leads to concerns that sovereignty precludes appeals to transnational moral norms, such as natural law duties or universal human rights. Some argue that state sovereignty is not limited by, but literally constituted by international law: there is no sovereignty outside the legal system that constructs it and, thus, the contours of state authority change as the content of international law changes. For example, current international law prohibiting torture, genocide, or disregard for basic human rights effectively redefines the scope of authority accorded states: such acts are not expressions of sovereignty, but abuses.

An often neglected line of argument shows that states themselves express their sovereignty sometimes in order to limit the scope of their own sovereignty – what S.I. Benn long ago called “auto-limitation.” Robert Keohane makes the same point: “…[S]overeignty is quite consistent with specific restraints. Indeed, a key attribute of sovereignty is the ability to enter into international agreements that constrain a state’s legal freedom of action” (in Holzgrefe, 283-284). On almost any account, state sovereignty includes the right to enter into treaties, just as personal autonomy rights can be expressed by making promises or signing contracts that obligate, bind, and limit future actions. If states have freely chosen to sign human rights treaties, for example, or the Genocide Convention, or the United Nations Charter, then, through that expression of their sovereign authority, they have limited what is within their authority later, for example, committing genocide, waging aggressive war, disregarding the outcomes of established procedures or processes. Then states involved in humanitarian emergencies are abusing, not exercising the sovereign authority they chose to limit. Though more controversial and problematic, such “auto-limitation” may also apply to outcomes of procedures yet to be implemented. So, for example, since provisions of the Charter allow for humanitarian interventions by Security Council authorization, any member state’s sovereignty is not violated by duly authorized outsiders using armed force to rescue, defend, or protect that state’s people from abuse.

Discussions of humanitarian intervention have led to alternative ways of thinking about state sovereignty. One line of thinking makes sovereignty of a state conditional and contingent (Holder, 89-96). A state has genuine sovereignty only if it meets minimal moral requirements, such as effective control in maintaining order and security, or avoiding egregious mistreatment of its people, or, less minimally, reflects the political will of the people themselves. So, failed or grossly abusive states, for example, have no sovereignty and, thus, an otherwise justified humanitarian intervention does not violate the target state’s sovereignty. Second, Robert Keohane has proposed that sovereignty rights need to be seen as separable, so that a state, based on certain criteria, retains some kinds of authority while losing others. Sovereignty rights can be “unbundled” and they admit of gradations. So, exclusion of external control over territory may be an authority lost by a state, but that same state may continue to have some limited domestic authority at the same time. Third is a proposal to see sovereignty as states’ responsibility, a kind of duty to protect all people’s human rights. As described in The Responsibility to Protect, states failing to protect their own citizens’ rights temporarily forfeit sovereignty rights, others’ duties of non-interference are suspended, and then other states or organizations assume the responsibility to protect persons by intervening, perhaps even militarily. That sovereignty includes domestic duties to respect and protect peoples’ rights is a feature of classic social contract theories of state authority, such as those of Locke, Kant, or even Hobbes.
The proposal, though, controversially maintains that each state’s sovereignty includes a responsibility to protect not only the rights of its citizens, but the rights of aliens in other lands, a responsibility of “saving strangers.” These alternative approaches all depart from the letter of international law about qualifying for state sovereignty. They are an extension of a greater emphasis on human rights as transnational moral norms. Alternative, normative understandings of the modern state show that, under certain conditions, humanitarian interventions are not violations of sovereignty at all.

c. The Problem of Selectivity

Among the most common objections to humanitarian interventions is “the vexed issue of selectivity.” The concern is that states or international organizations choose to intervene militarily in only some humanitarian emergencies: only some sufferings are selected for forceful action by outsiders. Some critics, such as Noam Chomsky (The New Military Humanism, 1999), see selectivity as undermining any and all moral merit to military interventions to protect basic human rights. If humanitarianism is the issue, why intervene here and not there? How does it come about that armed interventions take place in one crisis but not in another? How can it be morally acceptable that, though there are many emergencies warranting others’ forceful response, only some situations are selected for armed interventions and only some people’s basic human rights are defended by others’ military force?

The objection is sometimes seen in terms of ethical consistency: among all those situations where a humanitarian intervention is morally justifiable, in only some of those cases is an intervention conducted. It would appear that like cases are not being treated alike. The implicit appeal is to the universalizability of genuine moral judgments. But a lack of clarity or precision often cloaks the objection. First, sufficient sufferings by people – threshold conditions – are only one feature of similarity between cases, and appeals to similarities of sufferings do not alone make the compared cases morally justified. There are other necessary conditions to justifying an intervention (for example, likelihood of success) and sometimes those are not met in the cases being compared. The selectivity problem arises only when one or more situations satisfy all the requirements for justifying uses of armed force. Second, interventions being morally justified is not inconsistent with only some interventions taking place. If being morally justified means there is a right to intervene, then, as with most rights, the right-holder can choose whether to exercise the right or not, whether to actually intervene or not. If being morally justified means there is an imperfect duty to rescue, then, as an imperfect duty, the obligated parties can choose when and where to discharge the duty. Third, understood as ethical inconsistency, selectivity seems hardly sufficient reason to reject all humanitarian interventions as unjustified. The moral flaw of inconsistency does not require doing nothing: because one cannot or does not do everything morally justified on similar grounds, it does not follow one ought not ever do what is morally right.

A second version of the objection points to the substantive criteria by which interventions are selectively conducted, and there is something right about this form of the objection. It is problematic if, among an array of justifiable interventions, states select only some situations for intervening based on morally suspect criteria, such as regional bias or media attention (what is called “the CNN effect”). For example, in the 1990s military force was employed in the Balkans for humanitarian purposes, but not in Rwanda. Assuming both situations warranted intervention, the issue, then, is not only ethical inconsistency, but suspicions about the ethical acceptability of the substantive criteria for selective action. So, for example, if there is a moral right to intervene, is it not morally problematic if the right-holder chooses whether to intervene based on the race or region of those people suffering? If there is even an imperfect duty to intervene, is it not morally problematic to select those to be rescued based on whether they are European or Christian, or based on the extent and kind of media coverage provided? This version of the selectivity problem has merit in calling for diligence, discipline, and care in choosing how to exercise rights or discharge duties. It is not clear, however, that this version of the selectivity problem is sufficient reason to oppose all humanitarian interventions, unless the reliance on morally suspect criteria is pervasive or even unavoidable. And that leads to a third and the strongest version of the selectivity problem for humanitarian interventions.

The claim is that states selectively intervene based on national self-interest, not based on humanitarian need or warrant. Though seldom distinguished by critics of interventions, a weaker version of the objection is that, among morally justified interventions, states choose to intervene in those situations that serve their national interest. A kind of national prudence supervenes on the array of morally permissible interventions. It is not obvious that this is problematic for states any more than it is for individuals who invoke principles of prudence to choose among morally permissible possibilities. And it does not seem sufficient to reject humanitarian interventions as unjustified, even if, in fact, all states do combine moral and prudential considerations in selecting sites for intervention. A stronger version of the objection, however, reflects concerns about imperialistic ambitions or hegemonies of intervening parties. One can see this objection as a skepticism about what genuinely drives states’ decisions about whether to intervene or not. The selectivity objection, then, is not so much concerned about moral flaws or inconsistency, but relies on the inescapable role of national self-interest in deciding whether to intervene. Seen this way, the objection reflects political realism as an alternative framework for considering the international arena, including states and humanitarian interventions.

d. Political Realism

Political realism takes many forms, none of which support independent ethical norms as relevant to international relations, including states’ uses of armed force or war (See IEP article, “Political Realism”). Strong forms of descriptive realism maintain that all state action is in pursuit of national self-interest, typically understood in terms of national security, military or economic power, or material well-being. If all states’ actions are, in fact, motivated by self-interest, then state actions motivated solely or primarily by humanitarian considerations are not possible or morally justifiable. Such a strong form of descriptive political realism, however, is a dubious empirical generalization about international relations and about the scope and stability of what constitutes national interest. There are examples of states’ cooperating, of states sometimes acting on moral grounds, of states sometimes acting contrary to their national interest, or of states changing what constitutes their national interest (Buchanan and Golove, 873-874). Prescriptive political realism maintains that states should pursue their own national interests in the international arena: it advocates a norm of prudence, of pursuing self-interest, but not of morality, as properly governing state behavior. According to realisms, then, uses of armed force in defense of others’ human rights sometimes occur because it is in the national interest of the interveners. An example might be an intervention conducted in a bordering state, due to the national security interests threatened by having an inept or failed state as neighbor (for example, refugees and interruptions of oil or water supplies). The justifications for intervening, if any, are not moral principles, but appeals to promoting the intervener’s national interests, which is, the realist maintains, the way states do or should act.

Political realists’ amoralism about states’ actions typically correlates with a model of the international arena as analogous to Hobbes’s state of nature (See IEP articles, “Social Contract Theory” and “Thomas Hobbes: Moral and Political Philosophy”):

There is no global power, no supreme power to enforce cooperation and peace;
there is relative equality of power among states;
each state should pursue its self-interest, by any feasible means, including by anticipatory domination of other states when possible;
and there is no morality applicable amidst pervasive mutual assurance problems.

In contrast, supporters of just war theory, universal human rights, and morally justified humanitarian interventions typically see the international arena as more analogous to Locke’s state of nature (See IEP article, “John Locke: Political Philosophy”). There exist transnational moral norms (for example, of human rights, justice) that bind states and organizations in their relations to one another, including perhaps an international analogue to the Lockean executive right to punish and enforce those transnational norms, even by use of armed force. Though the proposed content of the transnational moral principles may vary, the relevance and applicability of moral principles opposes realists’ amoralism about international relations. Political realists and defenders of morally justified wars or humanitarian interventions reflect fundamentally different conceptions of world order.

e. Post-Colonialism and Feminism

Post-colonialism’s attention to issues of power, representations in discourse, perspective, and history provides an alternative approach to issues of war and humanitarian interventions. For example, the selectivity issue (IV.C. above) is seen as about abuse of power, and about the discourse of rights, law, and “just war” masking imperialistic ambitions or hegemonies of intervening parties. Examples of abuse abound, it is argued, from the days of the Cold War to later incursions in the Middle East and Afghanistan (Gregory). The moral universalism of human rights and other concepts employed as intervention threshold conditions (II. above) are not neutral, with their emphasis on the individual, on negative civil rights, and on the rule of law. The discourse of just war thinking looks at uses of military force from the perspective of those deciding whether to wage war, not from the perspective of those against whom the war is waged or those who are suffering. Intervening parties, it is argued, are former colonial powers with lingering imperialist ambitions and those to be protected are former subjects of these imperialist ambitions. Given the asymmetries of power in the world, colonialism and imperialism continue in the way in which dominating powers structure and influence the lives of those around the world, so much so that there are nearly insurmountable obstacles for the subaltern speaking and being heard (Spivak). Post-colonialist approaches call for skepticism, at least, about moral justifications for war or armed humanitarian interventions; and they call for involving diverse, alternative voices and thinking in response to human suffering.

Feminist thinking about humanitarian interventions includes challenges to the substance and implications of employing the “just war” framework. The questions posed by this approach “risk androcentric or sexist bias” and commonly proposed rules about just interventions “remain gendered in concealed ways” (Cudd, 360, 363). One challenge is to explore the ethics of care as an alternative approach (for example, Held). More direct challenges to the just war framework consider whether threshold conditions, such as genocide or crimes against humanity, incorporate rape and sexual atrocities that victimize women in particular (for example, Card), or whether oppression of women satisfies “just cause” requirements for using military force (Cudd, 369-370). Another concern is that proportionality requirements include among the effects of interventions consequences for enhancing or diminishing “women’s rights and power” and for the relational autonomy of individuals as that concept has been developed in other feminist work in ethics (Cudd, 366). The suggestions often include a call for more attention to non-military, preventive action to address human rights issues, including traditional gender roles and hierarchies.

f. Pacifism

A final consideration is another source of challenge to humanitarian interventions: pacifism. Just war theory’s attempt to delineate some wars as morally justified lies between political realism’s denial that morality applies to state behavior and pacifism’s rejection of all war, killing, or violence by states. Among the many varieties of pacifism, relevant to questions about humanitarian interventions is absolute anti-war pacifism, and, in this context, what is often called “just war pacifism.” Using typical just war requirements – jus ad bellum and jus in bello – it is argued that no war has satisfied or even can satisfy all jus bellum standards, including, it would follow, wars fought for humanitarian purposes. In effect, just war pacifism opposes all wars by applying rigorously and strictly all the standard requirements for a war to be justified.

Arguments for just war pacifism typically focus on a few jus bellum requirements: proportionality considerations, in bello discrimination as providing immunity for non-combatants, and the idea of war only as a last resort. Calling attention to the undeniable destructive consequences of war and use of military force, just war pacifists deny that the benefits do or can sufficiently outweigh the costs. Proportionality requirements are interpreted and applied in ways that they are not or can never be satisfied, even by uses of military force for humanitarian purposes. The argument depends on complex causal estimations and calculations about which certainty or reliability is dubious. Just war pacifism sees macro-proportionality as capable of much more justificatory work than it is accorded by many just war theorists. The argument is more effectively employed with respect to micro-proportionality and the in bello discrimination principle. Warring parties cannot avoid what is euphemistically called “collateral damage” – the death of non-combatants and destruction of non-military property – despite the features of contemporary warfare, with its “smart bombs,” drones, and technological targeting controls. Just war pacifism rightly attends to this feature. The pacifists argue that even with modern technology, levels of collateral damage remain too high to be morally justifiable. Morally acceptable standards are not and cannot be satisfied; thus, even if all ad bellum standards are met, no war is a just war. The difficulty with the argument is establishing precise levels of morally acceptable death and destruction for non-combatants, whether seen as unintended consequences or not. Of course, if no “collateral damage” is morally permissible, then it would seem that no war, no humanitarian intervention, could be a truly just war. Finally, just war pacifism demands that war be a last resort and argues that always there are or can be non-military alternatives.
These arguments typically turn on how to construe the last resort requirement. As mentioned above, a literal reading of the idea excludes most wars or interventions as unjust; and an expansive, counterfactual construal of the requirement makes no wars just, but tends to conflate advocacy for better preventive infrastructure and strategies with justifying responses to developing events. Just war pacifism, like any absolute, unconditional opposition to war and use of military force, must somehow negotiate a troubling moral path whereby innocent persons will not be rescued because of a superior principle prohibiting the use of armed force, even for humanitarian purposes to stop widespread, systematic human suffering.

5. References and Further Reading

Bass, Gary J. Freedom’s Battle: The Origins of Humanitarian Intervention. New York: Random House, 2008.
- An easily readable rendition of modern cases of interventions in order to show that “all of the major themes of today’s heated debates about humanitarian intervention … were voiced throughout the nineteenth century.”
Buchanan, Allen. Justice, Legitimacy, and Self-Determination: Moral Foundations for International Law. Oxford: Oxford University Press, 2004.
- A significant contribution to a number of issues and discussions, albeit challenging in its sophistication and conclusions rooted in a Kantian approach to moral theory.
Buchanan, Allen, and David Golove. “Philosophy of International Law,” in The Oxford Handbook of Jurisprudence & Philosophy of Law. Ed. Jules Coleman and Scott Shapiro. Oxford: Oxford University Press, 2002. 868-934.
- A defense and description of normative philosophy of international law, including attention to political realism, legal positivism, transnational distributive justice, human rights, secession, and humanitarian intervention (but not including just war theory).
Card, Claudia. “The Paradox of Genocidal Rape Aimed at Forced Pregnancy.” The Southern Journal of Philosophy 46 (2008): 176-189.
Chatterjee, Deen K., and Don E. Scheid, eds. Ethics and Foreign Intervention. Cambridge: Cambridge University Press, 2003.
- A collection of contributions to conceptual and normative issues of humanitarian intervention, the merits and limits of the “just war” approach, law and secession, and critiques of interventionism. Especially recommended are the essays by Hoffmann, Brown, Lucas, and Coady.
Cudd, Ann E. “Truly humanitarian intervention: considering just causes and methods in a feminist cosmopolitan frame.” Journal of Global Ethics 9 (2013): 359-375.
Fletcher, George P., and Jens D. Ohlin. Defending Humanity: When Force is Justified and Why. New York: Oxford University Press, 2008.
Govier, Trudy. “War’s Aftermath: The Challenge of Reconciliation.” War: Essays in Political Philosophy. Ed. Larry May. Cambridge: Cambridge University Press, 2008. 229-248.
Gregory, Derek. The Colonial Present: Afghanistan, Palestine, Iraq. Wiley-Blackwell, 2004.
Held, Virginia. “Military Intervention and the Ethics of Care.” The Southern Journal of Philosophy 46 (2008): 1-20.
Hoffmann, Stanley. The Ethics and Politics of Humanitarian Intervention. Notre Dame: University of Notre Dame Press, 1997.
Holder, Cindy. “Responding to Humanitarian Crises.” War: Essays in Political Philosophy. Ed. Larry May. Cambridge: Cambridge University Press, 2008. 85-104.
Holzgrefe, J. L., and Robert O. Keohane, eds. Humanitarian Intervention: Ethical, Legal, and Political Dilemmas. Cambridge: Cambridge University Press, 2003.
- A collection of contributions, including an excellent survey of philosophic issues in the humanitarian intervention debate by Holzgrefe. Other contributors address issues of international law, global ethics, and state sovereignty.
International Commission on Intervention and State Sovereignty (ICISS). The Responsibility to Protect: Report of the International Commission, and Supplementary Volume to the Report. Ottawa: International Development Research Centre, 2001.
- The Report is a pithy summary of major issues and of a defense of interventions in terms of “just war” principles, with attention to institutional matters and legalities related to the UN. The supplementary volume includes experts’ background essays on history and major issues (for example, “State Sovereignty,” “Prevention”), presentations of numerous cases, and extensive bibliographies organized by facets of the debates and controversies about humanitarian interventions.
Johnson, James Turner. Morality and Contemporary Warfare. New Haven: Yale University Press, 1999.
- A historically informed approach to the just war tradition and the theory’s suitability for today’s world. Chapter 3 is devoted to “the question of intervention.”
Jokic, Alexander, ed. Humanitarian Intervention: Moral and Philosophical Issues. Toronto: Broadview Press, 2003.
- A collection of conference papers; especially recommended are contributions by Ellis, Wilkins, Pogge, and Buchanan.
Lang, Anthony F., ed. Just Intervention. Washington, D.C.: Georgetown University Press, 2003.
- Especially relevant are contributions by Nardin, Chesterman, Weiss, and Cook.
Lee, Steven P. Ethics and War: An Introduction. Cambridge: Cambridge University Press, 2012.
- A comprehensive, sophisticated introduction to “just war” theory which includes advocating a “human rights paradigm” to address interventions and questions of state sovereignty.
Lucas, Jr., George R. Perspectives on Humanitarian Military Intervention. Berkeley: University of California Press, 2001.
Nardin, Terry, and Melissa S. Williams, eds. Humanitarian Intervention. NOMOS XLVII. New York: New York University Press, 2006.
Orend, Brian. The Morality of War. Second edition. Toronto: Broadview Press, 2013.
- Written by one of the major contributors to contemporary just war theory, including extensive attention to jus post bellum issues.
Orford, Anne. Reading Humanitarian Intervention. Cambridge: Cambridge University Press, 2003.
Rawls, John. The Law of Peoples. Cambridge: Harvard University Press, 1999.
- This work is prominent in discussions of a host of issues in global ethics and international law. An extension of his landmark social contract argument in A Theory of Justice (1971), in the context of a contractarian theory of international society and law, this work briefly addresses human rights, just wars, and interventions.
Smith, Michael. “Humanitarian Intervention: An Overview of the Ethical Issues.” Ethics and International Affairs 12 (1998): 63-79.
Spivak, Gayatri Chakravorty. “Can the Subaltern Speak?” Marxism and the Interpretation of Culture. Ed. C. Nelson and L. Grossberg. University of Illinois Press, 1988. 271-313.
Teson, Fernando R. Humanitarian Intervention: An Inquiry into Law and Morality. Third edition. Ardsley, NJ: Transnational Publishers, 2005.
- Written by an international law professor, this volume develops a philosophic and legal defense of interventions from a decidedly liberal, Kantian perspective.
Walzer, Michael. Just and Unjust Wars: A Moral Argument with Historical Examples. Third edition. New York: Basic Books, 1977, 2000.
- Now in a fourth edition, this volume has become the classic, early 21st century discussion of “Just War” theory in its entirety. Chapter 6 is devoted to interventions and the Preface to the Third Edition succinctly outlines major issues for morally justifying humanitarian interventions.
Walzer, Michael. Arguing about War. New Haven: Yale University Press, 2004.
- A collection of essays addressing developing issues in just war theory, including humanitarian interventions (see especially selection 5, “The Politics of Rescue”).
Wheeler, Nicholas J. Saving Strangers: Humanitarian Intervention in International Society. Oxford: Oxford University Press, 2000.
- Examines the most prominent cases of the last half century to explore “how different theories of international society lead to different conceptions of the legitimacy of humanitarian intervention.”

Author Information

Robert Hoag
Email: Bob_Hoag@berea.edu
Berea College
U. S. A.

Epistemic Consequentialism

Consequentialism is the view that, in some sense, rightness is to be understood in terms of conduciveness to goodness. Much of the philosophical discussion concerning consequentialism has focused on moral rightness or obligation or normativity. But there is plausibly also epistemic rightness, epistemic obligation, and epistemic normativity. Epistemic rightness is often denoted with talk of justification, rationality, or by merely indicating what should be believed. For example, my belief that I have hands is justified, while my belief that I will win the lottery is not; Alice’s total belief state is rational, while Lucy’s is not; we all should be at least as confident in p or q as we are in p. The epistemic consequentialist claims, roughly, that these kinds of facts about epistemic rightness depend solely on facts about the goodness of the consequences. In slogan form, such a view holds that the epistemic good is prior to the epistemic right.

Many epistemologists seem to have sympathy for the basic idea behind epistemic consequentialism, because many epistemologists have been attracted to the idea that epistemic norms that describe appropriate belief-forming behavior ultimately earn their keep by providing us with some means to garner what is often thought to be the epistemic good of accurate beliefs. Consequentialist thinking has also gained popularity among more formally minded epistemologists, who apply the tools of decision theory to argue in consequentialist fashion for various epistemic norms. And there is also a consequentialist strand in certain areas of philosophy of science, especially those areas that attempt to explain how it is that science as a whole might have considerable epistemic success even if individual scientists are acting irrationally. Thus, there is a kind of prima facie plausibility to epistemic consequentialism.

Consequentialism
Final Value and Veritism
Consequentialist Theories
Summing Up: Some Useful Distinctions
Objections to Epistemic Consequentialism
References and Further Reading

1. Consequentialism

There is unfortunately no consensus about what precisely makes a theory a consequentialist theory. Sometimes it is said that the consequentialist understands the right in terms of the good. Somewhat more generally, but still imprecisely, we could say that the consequentialist maintains that normative facts about Xs (for example, facts about the rightness of actions) depend solely on facts about the value of the consequences of Xs. In light of this, some see consequentialism as a reductive thesis: it purports to reduce normative facts (for instance, about what one ought to do) to evaluative facts of a certain sort (for instance, about what is good). Smith (2009) and others, however, mark what is distinctive about consequentialism differently. Some maintain that a consequentialist is committed to understanding what is right or obligatory in terms of what will maximize value (Smart and Williams 1973, Pettit 2000, Portmore 2007). Still others maintain that a consequentialist is one who is committed to only agent-neutral, rather than agent-relative, prescriptions (where an example of an agent-relative prescription is one that instructs each person S to ensure that S not lie, whereas an agent-neutral prescription instructs each person S to minimize lying) (McNaughton and Rawling 1991). And finally, some maintain that what is distinctive about consequentialism is the lack of intrinsic constraints on action types (Nozick 1974, Nagel 1986, Kagan 1997).

Perhaps the best way to elucidate consequentialism, then, is to point to paradigm cases of consequentialist theories and attempt to generalize from them. On this score there is some agreement: classic hedonic utilitarianism (of the sort defended by Bentham and Mill) is thought to be a clear instance of a consequentialist theory. That theory maintains that an action is morally right if and only if the total sum of pleasure minus pain that results from that action exceeds the total sum of pleasure minus pain of any alternative to that action. The normative facts here are facts about the moral rightness of actions and the utilitarian claims that these facts depend solely on facts about the moral goodness of the consequences of actions, where moral goodness is measured by summing up total pleasure minus total pain.

Though it is not possible to give an uncontroversial set of necessary and sufficient conditions for a theory being a species of consequentialism, it is useful to see that there is some sort of unity to views, such as hedonic utilitarianism, normally classified as consequentialist. The following three-step “recipe” for a consequentialist theory evinces this unity, and will be useful to refer to later. (A similar recipe is given by Berker 2013a,b.)

Step 1. Final Value: identify what has final value, where something has final value iff it is valuable for its own sake (sometimes the term “intrinsic value” is used in the same way).

Example: For the classic hedonic utilitarian, pleasure is the sole thing of final value and pain is the sole thing of final disvalue; thus, final value is here generalizing the concept of moral goodness above.

Step 2. Ranking: explain how certain things relevant to the normative facts you care about are ranked in virtue of their conduciveness to things with final value.

Example: The normative facts of interest to the classic hedonic utilitarian are facts about the rightness and wrongness of actions, so actions are the relevant things to rank. The classic hedonic utilitarian says that actions can be ordered by calculating for each action the sum of the total final value in the consequences of that action.

Step 3. Normative Facts: explain how the normative facts are determined by facts about the rankings.

Example: The classic hedonic utilitarian says that an action a is right if and only if it is ranked at least as high as any action that is an alternative to a.

2. Final Value and Veritism

Before looking at specific consequentialist epistemic theories, it is worth saying something about what epistemic consequentialists typically think about the first step in the recipe, which concerns final value. Many who are sympathetic to epistemic consequentialism also adhere to veritism (the term is due to Goldman 1999; Pritchard 2010 calls this view epistemic value t-monism). According to veritism, the only thing of final epistemic value is true belief and the only thing of final epistemic disvalue is false belief. Generalizing somewhat so that the view can capture approaches that think of belief as graded, we can say that according to veritism, the only thing of final epistemic value is accuracy and the only thing of final epistemic disvalue is inaccuracy. Not all epistemic consequentialists are veritists; others have thought that there is more to final epistemic value than mere accuracy, such as the informativeness or interestingness of the propositions believed, or whether the propositions believed are mutually explanatory or coherent. Others have thought that things such as wisdom (Whitcomb 2007), understanding (Kvanvig 2003), or a love of truth (Zagzebski 2003) have final epistemic value.

But even those consequentialists who think that accuracy does not exhaust what is epistemically valuable tend to think that accuracy is an important component of final epistemic value (for an alternative view, see Stich 1993). It is not hard to see why such a view is theoretically attractive. Although all explanations must come to an end somewhere, it seems that veritism, or at least something like it, is in a good position to give satisfying explanations of our epistemic norms. Veritism together with consequentialism can do so by showing how conforming to a given norm conduces toward the goal of accuracy. If one could show, say, that by respecting one’s evidence one is likely to hold accurate beliefs, then one has a better explanation for an evidence-respecting norm than does the person who says such a norm is simply a brute epistemic fact.

Questions about final epistemic value are important for would-be epistemic consequentialists. This article notes the different views that epistemic consequentialists have held concerning final epistemic value, but there is little substantive discussion about the advantages and disadvantages of competing views about final epistemic value. That said, the debate concerning the nature of final epistemic value is an important debate for epistemic consequentialists to watch. In particular, the epistemic consequentialist will need a notion of final epistemic value according to which final epistemic value is the sort of thing that it makes sense to promote.

3. Consequentialist Theories

In light of the consequentialist recipe above, a specific epistemic consequentialist theory can be obtained by specifying the bearers of final epistemic value, the principle by which options are then ranked in terms of final epistemic value, and the normative facts that this ranking determines. Below, specific epistemic consequentialist theories are presented in this way.

a. A Simple Example

For illustrative purposes, consider a very simple consequentialist theory. According to this view, the only thing of final epistemic value is true belief. Then, say that a belief is justified to the extent that it garners epistemic value for the believer. This can be put in the consequentialist recipe as follows:

Step 1. Final Value: True beliefs have final epistemic value; false beliefs have final epistemic disvalue.

Step 2. Ranking: The normative facts at issue are facts about whether beliefs are justified, so beliefs are the natural thing to rank. According to this view, S’s belief that p is ranked above S’s belief that q iff the belief that p in itself and in its causal consequences garners more epistemic value for S than the belief that q.

Step 3. Normative Facts: The belief that p is justified iff it is ranked above every alternative to believing p.

One might think that this simple view has a relatively obvious flaw. It seems to imply that every true belief is justified and every false belief unjustified. This is what Maitzen (1995) argues:

If one seeks, above all else, to maximize the number of true (and minimize the number of false) beliefs in one’s (presumably large) stock of beliefs, then adding one more true belief surely counts as serving that goal, while adding a false belief surely counts as disserving it. (p. 870)

As clear as this seems, it is actually mistaken. For although the belief that p (when p is false) will not directly add value to S’s belief state, such a false belief may have an effect on other beliefs that S forms later and so, in total, be preferable to adopting the true belief that ~p. That said, no one has defended such a simple version of epistemic consequentialism. In actual practice, the relationship between final epistemic value and epistemic justifiedness is not proposed to be as direct as this simple view would have it. With that, we turn to examine such views.

b. Cognitive Decision Theory

Suppose that we think that rational agents have degrees of belief that can be represented by probability functions, but we think there are still important all-or-nothing epistemic options that these agents have regarding which propositions they accept as true. Patrick Maher (1993), for instance, argues that even if we think of scientists as having degrees of belief, we still need a theory of acceptance if we want to understand science. Why is this? Maher defines accepting that p as sincerely asserting that p (this is not the only definition of acceptance; van Fraassen (1980), though he is writing primarily about subjective probability, thinks of acceptance as a kind of cognitive commitment; Harman (1986, p. 47) sees acceptance as the same as belief and says that one accepts p when (1) one allows oneself to use p as part of one’s starting point for further reasoning and when (2) one takes the issue whether p to be closed in the sense that one is no longer investigating that issue). Further, Maher maintains that the scientific record tells us about which theories scientists asserted, not about what credences scientists had. Thus, a theory of acceptance (in the sense of sincere assertion) is needed to understand science on Maher’s view.

If we think of things roughly in this way, then it is natural to turn to decision theory to determine what propositions agents should accept. Decision theory tells an agent which action it would be rational to perform based on a ranking of each action available to the agent in terms of the action’s expected value. To find the expected value of an action for an agent, one considers each set of consequences the agent thinks is possible given the performance of that action, and then sums up the value of those consequences, weighted by the agent’s degrees of belief that those consequences are realized conditional on that action. An action is then taken to be rational iff no other action is ranked higher than it in terms of expected value. When considering which proposition it would be rational for an agent to accept, it is natural to set things up similarly. Instead of evaluating the usual type of actions, one evaluates acts of acceptance of propositions that are available to the agent. These different acts of acceptance can be ranked in terms of the expected final epistemic value of each act of acceptance.
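As a rough illustration of the ranking just described, the following Python sketch computes expected values for three acts of acceptance and picks out the rational ones. The act names, outcome labels, values, and credences are all hypothetical, chosen only for the example; they are not drawn from any of the authors discussed here.

```python
from typing import Dict, List

# Hypothetical decision problem. For each act we record the agent's credence
# in each outcome conditional on the act, and the value of that outcome.
ACTS: Dict[str, Dict[str, Dict[str, float]]] = {
    "accept_h": {
        "credence": {"h_true": 0.7, "h_false": 0.3},
        "value": {"h_true": 1.0, "h_false": -1.0},
    },
    "accept_not_h": {
        "credence": {"h_true": 0.7, "h_false": 0.3},
        "value": {"h_true": -1.0, "h_false": 1.0},
    },
    "suspend": {
        "credence": {"h_true": 0.7, "h_false": 0.3},
        "value": {"h_true": 0.0, "h_false": 0.0},
    },
}


def expected_value(act: str) -> float:
    """Sum the value of each outcome, weighted by the credence in that outcome."""
    spec = ACTS[act]
    return sum(spec["credence"][o] * spec["value"][o] for o in spec["credence"])


def rational_acts() -> List[str]:
    """An act is rational iff no other act has a higher expected value."""
    scores = {act: expected_value(act) for act in ACTS}
    top = max(scores.values())
    return [act for act, score in scores.items() if score == top]


if __name__ == "__main__":
    for act in ACTS:
        print(act, round(expected_value(act), 3))
    print("rational:", rational_acts())
```

With these made-up numbers, accepting h has the highest expected value, so it is the only act the rule counts as rational.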

Such an approach to acceptance is briefly discussed by Hempel (1960). Isaac Levi (1967) presents a more complete theory of this kind. Levi imagines that a scientist has a set of mutually exclusive and jointly exhaustive hypotheses h₁, h₂,…,h_n and that the scientist’s options for acts of acceptance are to accept one of the h_i or to accept a disjunction of some of them. We suppose that scientists have subjective probability functions, which reflect the evidence that they have gathered with respect to the hypotheses in question. Levi’s basic proposal is that agents should accept some hypothesis (or disjunction of hypotheses) if doing so maximizes expected final epistemic value where the weight for the expectation is provided by the subjective probability function (this is very similar to, though not identical to, the weighting in terms of degrees of belief mentioned above). What is final epistemic value for Levi (Levi uses the term “epistemic utility”)? According to Levi, final epistemic value has two dimensions that correspond to what the goals of any disinterested researcher ought to be. The first dimension is truth. True answers are valued more than false answers. The second dimension is “relief from agnosticism.” The idea here is that more-informative answers (for example, “X wins”) are valued more than less-informative answers (for example, “X or Y wins”). These values pull in opposite directions. One can easily accept a true proposition if informativeness is ignored as the disjunction “X wins or X does not win” is sure to be true. Similarly, one can easily accept an informative proposition if truth is ignored. Accordingly, Levi defines a family of functions that balance these two dimensions of value. He does not settle on one way of balancing, but instead considers as permissible the whole family of functions that balance these two dimensions of value in different ways.

Several features of Levi’s approach are worth noting. First, note that on Levi’s view it can happen that the proposition a scientist should accept is not the one that the scientist sees as most probable, because final epistemic value is a function of both the truth/falsity of the proposition and its informativeness.

The second point worth noting brings us to an important distinction when considering epistemic consequentialism. Levi is interested in the expected final epistemic value of accepting some proposition h₁, but where the value of the consequences of accepting h₁ includes only the value of accepting h₁ and not the causal consequences of this acceptance. That is, suppose an agent has the option of accepting h₁ or accepting h₂. Suppose that h₁ is both more likely to be true and more informative than h₂. So on any weighting, and on any final epistemic value function, accepting h₁ will rank higher than h₂ if we ignore the later causal consequences of these acts of acceptance. But suppose that accepting h₂ is known to open up opportunities for garnering much more final epistemic value later (perhaps by allowing one to work on a research project only open to those who accept h₂). Levi’s theory says that the agent should accept h₁, not h₂. Thus, it is a form of consequentialism that ignores the causal consequences of the options being evaluated. What matters are not the causal consequences of accepting h₁, but rather the expected final value of the acceptance of h₁ itself, ignoring its later causal consequences.

One might argue that this feature of Levi’s view is enough to make it thereby not a form of consequentialism, because it is not faithful to the idea that the total set of causal consequences of an option (for example, an action or a belief or an act of acceptance) is relevant to the normative verdict concerning that option. Be that as it may, there is still a teleological structure to Levi’s view: acts of acceptance inherit their normative properties in virtue of conducing to something with final epistemic value. It is just that “conducing” is construed noncausally, in this case as something more akin to instantiation (Berker (2013a,b) explicitly allows such views to count as instances of epistemic consequentialism or epistemic teleology—he uses both terms). For future reference, I will use the term “restricted consequentialism” to refer to views that are teleological in the sense of Levi’s view, but do not take the total set of causal consequences of an option to be relevant to its normative status. In section 5, this distinction is examined more carefully.

Cognitive decision theory fits into our consequentialist recipe as follows:

Step 1. Final Value: Accepting propositions that are true has final epistemic value, and accepting propositions that are informative has final epistemic value. The total final epistemic value of accepting a proposition is a function of both its truth and its informativeness, though the way that these values are balanced can permissibly differ from agent to agent.

Step 2. Ranking: The act of accepting some answer to a question is ranked according to its subjective expected final epistemic value.

Step 3. Normative Facts: One should accept answer a to question Q iff accepting a is ranked at least as high as every other alternative answer to Q.
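The three steps above can be sketched in Python under strong simplifying assumptions. The balancing function below is hypothetical and is not Levi's own family of functions: it simply mixes the probability that an accepted disjunction is true with a crude informativeness measure (fewer disjuncts means more informative), using a `weight` parameter that stands in for the agent's permitted way of balancing the two values. The hypothesis names and probabilities are likewise invented.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def options(hypotheses: List[str]) -> List[FrozenSet[str]]:
    """Every non-empty disjunction of hypotheses, as a set of its disjuncts."""
    return [
        frozenset(combo)
        for size in range(1, len(hypotheses) + 1)
        for combo in combinations(hypotheses, size)
    ]


def informativeness(option: FrozenSet[str], n: int) -> float:
    """Hypothetical measure: 1 for a single hypothesis, 0 for the full disjunction.

    Assumes there are at least two hypotheses (n >= 2).
    """
    return (n - len(option)) / (n - 1)


def expected_epistemic_value(
    option: FrozenSet[str], prob: Dict[str, float], weight: float
) -> float:
    """(1 - weight) * P(option is true) + weight * informativeness(option)."""
    p_true = sum(prob[h] for h in option)
    return (1 - weight) * p_true + weight * informativeness(option, len(prob))


def best_options(prob: Dict[str, float], weight: float) -> List[Set[str]]:
    """Options ranked at least as high as every alternative (Step 3)."""
    scores = {o: expected_epistemic_value(o, prob, weight) for o in options(sorted(prob))}
    top = max(scores.values())
    return [set(o) for o, s in scores.items() if abs(s - top) < 1e-12]


if __name__ == "__main__":
    prob = {"h1": 0.5, "h2": 0.3, "h3": 0.2}  # hypothetical subjective probabilities
    for weight in (0.2, 0.35, 0.5):
        print(weight, best_options(prob, weight))
```

On these numbers, a low weight on informativeness recommends accepting the full disjunction, a middling weight recommends "h1 or h2", and a higher weight recommends h1 alone, even though h1 is less probable than either disjunction. This is only meant to show how different permissible balances can yield different verdicts from the same probabilities.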

For criticism of this approach, see Stalnaker (2002) and Percival (2002).

c. Accuracy First

Cognitive decision theory takes for granted that agents have a certain kind of doxastic state, represented by a probability function, and uses this to tell us about the norms for the different kind of doxastic state of acceptance. But suppose that one does not want to take for granted such an initial doxastic state. Does decision theory have anything to offer such an epistemic consequentialist?

James Joyce (1998) shows that the answer to this question is “yes” if we accept certain assumptions about final epistemic value that many find plausible. Joyce argues that degrees of belief—henceforth, credences—that are not probabilities are accuracy-dominated by credences that are probabilities. A credence function, c, is accuracy-dominated by another, c′, when in all possible worlds, the accuracy of c′ is at least as great as the accuracy of c, and in at least one world, the accuracy of c′ is greater than the accuracy of c (for an introduction to possible worlds, see IEP article Modal Metaphysics). Joyce uses this, plus some assumptions about final epistemic value to establish probabilism, the thesis that rational credences are probabilities.

As Pettigrew (2013c) has noted, the basic Joycean framework requires one to do three things. First, one defines a final epistemic value function (often called an “epistemic utility function”). Second, one selects a decision rule from decision theory. Finally, one proves a mathematical theorem of the sort that says only doxastic states with certain features are permissible given the decision rule and final epistemic value function. Let us consider each of these steps in turn.

The final epistemic value functions that are typically used are different in kind than the functions used in cognitive decision theory. Whereas the final epistemic value functions in cognitive decision theory tend to value both accuracy—that is, truth and falsity—and informativeness, the final epistemic value functions in the Joycean tradition value only accuracy (this is why the moniker “accuracy first” is appropriate). Accuracy can be understood in different ways. There are two main issues here: (1) what counts as perfect accuracy? (2) how does one measure how far away a doxastic state is from perfect accuracy? With respect to (1), Joyce (1998) takes a credence function to be perfectly accurate at a world when the credence function matches the truth-values of propositions in that world (that is, assigns 1s to the truths and 0s to the falsehoods). Many have followed him in this, although there are alternatives (for example, one could think that a credence function is perfectly accurate at a world if it matches the chances at that world rather than the truth-values at that world). With respect to (2), things get more complicated. The appropriate mathematical tool to use to calculate the distance a credence function is from perfect accuracy is a scoring rule, that is, a function that specifies an accuracy score for credence x in a proposition relative to two possibilities: the possibility that the proposition is true and the possibility that it is false. There are many constraints that can be placed on scoring rules, but one popular constraint is that the scoring rule be proper. A scoring rule is proper if and only if the expected accuracy score of a credence of x in a proposition q, where the expectation is weighted by probability function P, is maximized at x = P(q). Putting together a notion of perfect accuracy and a notion of distance to perfect accuracy yields a final epistemic value function that is sensitive solely to accuracy. 
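The propriety condition can be checked numerically for a particular rule. The sketch below assumes a quadratic rule written in a higher-is-better form, scoring credence x as 1 − (v − x)² where v is 1 if the proposition is true and 0 if it is false, and contrasts it with a linear rule, 1 − |v − x|, which is a standard example of an improper rule. The function names and the grid search are illustrative choices; a grid search is a check, not a proof.

```python
from typing import Callable, List

# Candidate credences x = 0.000, 0.001, ..., 1.000
GRID: List[float] = [i / 1000 for i in range(1001)]


def quadratic(v: int, x: float) -> float:
    """Quadratic accuracy score of credence x when the truth value is v."""
    return 1 - (v - x) ** 2


def linear(v: int, x: float) -> float:
    """Linear accuracy score of credence x when the truth value is v."""
    return 1 - abs(v - x)


def expected_score(rule: Callable[[int, float], float], x: float, p: float) -> float:
    """Expected score of credence x, weighted by probability p that q is true."""
    return p * rule(1, x) + (1 - p) * rule(0, x)


def best_credence(rule: Callable[[int, float], float], p: float) -> float:
    """Grid point that maximizes the expected score under probability p."""
    return max(GRID, key=lambda x: expected_score(rule, x, p))


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.75):
        print(p, best_credence(quadratic, p), best_credence(linear, p))
```

For the quadratic rule the maximizing credence coincides with P(q) at each tested value, as propriety requires. For the linear rule the maximum sits at an extreme credence (0 or 1) whenever P(q) differs from 0.5, which is why that rule fails the constraint.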
One proper scoring rule that is often used as a measure of accuracy is the Brier score. Let v_w(q) be a function that takes value 1 if proposition q is true at possible world w and that takes value 0 if proposition q is false at possible world w. Thus, v_w(q) merely tells us whether proposition q is true or false at possible world w. In addition, let c(q) be the credence assigned to proposition q, and let F be the set of propositions to which our credence function assigns credences. Then the Brier score for that credence function at possible world w is:

Brier(c, w) = Σ_{q ∈ F} [1 − (v_w(q) − c(q))²]

On this formulation, a higher score indicates greater accuracy. This will give us an accuracy score for every credence function for any world we please. Suppose, for example, that we are considering two credence functions defined over only the proposition q and its negation:

c₁(q) = 0.75, c₁(~q) = 0.25

c₂(q) = 0.8, c₂(~q) = 0.3

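There are two possible worlds to consider: the world where q is true and the world where it is false. In the world (call it “w1”) where q is true, the Brier score for each credence function is as follows:

Brier(c₁, w1) = [1 − (1 − 0.75)²] + [1 − (0 − 0.25)²] = 0.9375 + 0.9375 = 1.875

Brier(c₂, w1) = [1 − (1 − 0.8)²] + [1 − (0 − 0.3)²] = 0.96 + 0.91 = 1.87

As one can verify, c₁ scores better than c₂ in a world where q is true. Now, consider a world where q is false (call this world “w2”):

Brier(c₁, w2) = [1 − (0 − 0.75)²] + [1 − (1 − 0.25)²] = 0.4375 + 0.4375 = 0.875

Brier(c₂, w2) = [1 − (0 − 0.8)²] + [1 − (1 − 0.3)²] = 0.36 + 0.51 = 0.87

Again, as one can verify, c₁ scores better than c₂ in a world where q is false.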

Once one has a final epistemic value function, such as the Brier score, one must pick a decision rule. Joyce (1998) uses the decision rule that dominated options are impermissible. In the example immediately above, c₂ is dominated by c₁ because c₁ scores at least as well as c₂ in every possible world and strictly better in at least one (here, in both). Thus, c₂ is an impermissible credence function to have.
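The dominance check for this two-proposition example can be reproduced with a short Python sketch. It assumes the higher-is-better form of the Brier score, summing 1 − (v_w(q) − c(q))² over the propositions, which matches the scores of 1.5 and 1.75 reported later in this section. The dictionary layout and function names are illustrative.

```python
from typing import Dict

Credence = Dict[str, float]
World = Dict[str, int]

# Truth values of q and ~q at the two possible worlds
WORLDS: Dict[str, World] = {
    "w1": {"q": 1, "~q": 0},  # q is true
    "w2": {"q": 0, "~q": 1},  # q is false
}

c1: Credence = {"q": 0.75, "~q": 0.25}  # coherent: credences sum to 1
c2: Credence = {"q": 0.8, "~q": 0.3}    # incoherent: credences sum to 1.1


def brier_score(credence: Credence, world: World) -> float:
    """Sum over propositions of 1 - (truth value - credence)^2; higher is better."""
    return sum(1 - (world[p] - credence[p]) ** 2 for p in credence)


def dominates(a: Credence, b: Credence) -> bool:
    """a is at least as accurate as b in every world and strictly better in one."""
    pairs = [(brier_score(a, w), brier_score(b, w)) for w in WORLDS.values()]
    return all(x >= y for x, y in pairs) and any(x > y for x, y in pairs)


if __name__ == "__main__":
    for name, world in WORLDS.items():
        print(name, brier_score(c1, world), round(brier_score(c2, world), 4))
    print("c1 dominates c2:", dominates(c1, c2))
    print("c2 dominates c1:", dominates(c2, c1))
```

Running the check confirms that c₁ is more accurate than c₂ in both worlds, so c₂ is the dominated (and hence impermissible) function under Joyce's decision rule.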

Our example considers only two very simple credence functions. The final step in Joyce’s program is to prove a mathematical theorem that generalizes the specific thing we saw above. Joyce (1998) proves that for certain choices of accuracy measures, including the Brier score, every incoherent credence function is dominated by some coherent credence function, where a credence function is coherent iff it is a probability function. (Note that in our example, c₂ is incoherent while c₁ is coherent, thus illustrating an instance of this theorem.) Recall that probabilism is the thesis that rational credence functions are coherent. If we take permissible credence functions to be rational credence functions and if we can prove that no probabilistically coherent function is dominated by some probabilistically incoherent function—something that Joyce (1998) does not prove, but that is proven in Joyce (2009)—then we have a proof of probabilism from some assumptions about final epistemic value and about an appropriate decision rule.

Others have altered or extended this approach in various ways. One alteration of Joyce’s program is to use a different decision rule, for instance, the decision rule according to which permissible options maximize expected final epistemic value. Leitgeb and Pettigrew (2010a,b) use this decision rule to prove that no incoherent credence function maximizes expected utility.

The results can be extended to other norms, too. For instance, conditionalization is a rule about how to update one’s credence function in light of acquiring new information. Suppose that c is an agent’s credence function and c_e is the agent’s credence function after learning e and nothing else. Conditionalization maintains that the following should hold:

For all a, and all e, c(a|e) = c_e(a), so long as c(e) ≠ 0.

In this expression, c(a|e) is the conditional probability of a, given e. Greaves and Wallace (2006) prove that, with suitable choices for accuracy measures, the updating rule conditionalization maximizes expected utility in situations where the agent will get some new information from a partition (a simple case of this is where an agent will either learn p or learn ~p). Leitgeb and Pettigrew (2010a,b) give an alternative proof that conditionalization maximizes expected utility.
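To make the updating rule concrete, here is a minimal Python sketch of conditionalization over a finite set of worlds. The four worlds and their prior credences are hypothetical, and propositions are modeled as sets of worlds, which is an assumption of the sketch rather than something the rule itself requires. It shows only what the rule prescribes, not the expected-utility arguments for it.

```python
from typing import Dict, Set

Credence = Dict[str, float]


def credence_in(c: Credence, proposition: Set[str]) -> float:
    """Credence in a proposition = total credence in the worlds where it holds."""
    return sum(p for world, p in c.items() if world in proposition)


def conditionalize(c: Credence, e: Set[str]) -> Credence:
    """Return c_e: zero out worlds incompatible with e and renormalize by c(e)."""
    c_of_e = credence_in(c, e)
    if c_of_e == 0:
        raise ValueError("conditionalization is undefined when c(e) = 0")
    return {world: (p / c_of_e if world in e else 0.0) for world, p in c.items()}


# Hypothetical prior over four worlds
prior: Credence = {
    "rain_cold": 0.2,
    "rain_warm": 0.1,
    "dry_cold": 0.3,
    "dry_warm": 0.4,
}
rain = {"rain_cold", "rain_warm"}  # the evidence e
cold = {"rain_cold", "dry_cold"}   # the proposition a

if __name__ == "__main__":
    posterior = conditionalize(prior, rain)
    conditional = credence_in(prior, rain & cold) / credence_in(prior, rain)
    print("c(a|e) =", conditional)
    print("c_e(a) =", credence_in(posterior, cold))
```

The two printed values agree, which is just the equality c(a|e) = c_e(a) stated above, here for a prior in which c(e) is nonzero.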

Joyce is concerned with proving norms for degrees of belief. The approach can be extended to prove norms where all-or-nothing belief states are taken as primitive. Easwaran and Fitelson (2015) extend the approach in this way. Interestingly, their approach yields the result that some logically inconsistent belief states are permissible (for instance, in lottery cases). The approach has also been extended to comparative confidence rankings (where a comparative confidence ranking represents only certain qualitative facts about how confident an agent is in propositions—for instance, that she is more confident in p than in q). Williams (2012) has extended the approach in a different direction by examining cases where the background logic is nonclassical.

Joyce’s (1998) approach fits nicely into the consequentialist recipe (and subsequent work can be made to fit into the recipe, too):

Step 1. Final Value: Credences have final epistemic value in proportion to how accurate they are.

Step 2. Ranking: Credence functions are put into two classes: dominated credence functions and non-dominated credence functions.

Step 3. Normative Facts: A credence function is permissible to hold if and only if it is non-dominated.

In this way, the accuracy-first approach appears to be an especially “pure” version of epistemic consequentialism. The project is to work out what the epistemic norms are for doxastic states given that you care only about the accuracy of those doxastic states.

However, one prominent objection to the accuracy-first approach questions this. To see this, note that the verdicts about which credence functions dominate (or maximize expected epistemic value) are not sensitive to the total causal consequences of adopting a credence function as they only look at the expected epistemic value of that state and not the causal effects of the adoption of that state. There are really two points here. The first point is the same point that was noted with respect to cognitive decision theory: the accuracy-first program seems to be an instance of restricted consequentialism. This can make the view seem to not genuinely be a consequentialist view. Greaves (2013) raises some objections to the program along these lines; the issue she raises is very similar to the kinds of issues that Berker (2013a,b) and Littlejohn (2012) have raised in objections to epistemic consequentialism in traditional epistemology. The general worry is discussed below in section 5a.

The second point concerns a distinction that can be drawn between evaluating a doxastic state and evaluating the adoption of a doxastic state. The accuracy-first program seems to be interested in the former rather than the latter, which can make it seem further still from traditional consequentialism. This issue can be brought out by an example due to Michael Caie (2013). Suppose we are considering what the permissible credence function is with respect to only the propositions q and ~q where q is a self-referential proposition that says “q is assigned less than 0.5 credence.” This is an odd proposition in that if q is assigned less than 0.5 credence, then it is true (and so it would be more accurate to increase one’s credence in q), but if one increases one’s credence in q to 0.5 or greater, then q is false (and so it would be more accurate to decrease one’s credence in q). In such a situation, an incoherent credence function appears to dominate the coherent ones. To see this, note that there are no worlds where c(q) = 1, c(~q) = 0, and where q is true (because if c(q) = 1, then q is false) or where c(q) = 0, c(~q) = 1, and where q is false (because if c(q) = 0, then q is true). The best that a coherent credence function can do is to assign c(q) = c(~q) = 0.5. In that case, q is false, and so the Brier score is 1.5. But compare this with the credence function, c*, according to which c*(q) = 0.5 and c*(~q) = 1. In that case, q is again false, and so c*(~q) gets a better score than does c(~q). Overall, c* gets a Brier score of 1.75.
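The figures in this example can be checked with a small Python sketch in which the truth value of q is fixed by the credence assigned to it. The sketch assumes the higher-is-better Brier score, summing 1 − (v − c)² over q and ~q, and searches coherent credence functions on a grid of step 0.01. The grid search is a numerical check rather than a proof, and the function names are illustrative.

```python
from typing import List


def q_is_true(c_q: float) -> bool:
    """q says: 'q is assigned less than 0.5 credence'."""
    return c_q < 0.5


def brier_self_referential(c_q: float, c_not_q: float) -> float:
    """Brier score at the only world compatible with holding these credences."""
    v_q = 1 if q_is_true(c_q) else 0
    v_not_q = 1 - v_q
    return (1 - (v_q - c_q) ** 2) + (1 - (v_not_q - c_not_q) ** 2)


# Coherent credence functions: c(q) = x and c(~q) = 1 - x
GRID: List[float] = [i / 100 for i in range(101)]
best_x = max(GRID, key=lambda x: brier_self_referential(x, 1 - x))
best_coherent_score = brier_self_referential(best_x, 1 - best_x)

# The incoherent function c* from the example
c_star_score = brier_self_referential(0.5, 1.0)

if __name__ == "__main__":
    print("best coherent c(q):", best_x, "score:", best_coherent_score)
    print("c* score:", c_star_score)
```

On this grid the best coherent assignment is c(q) = 0.5 with a score of 1.5, while c* scores 1.75, in line with the numbers given in the example.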

How can this be, if we have proofs that probabilistically coherent credence functions dominate incoherent credence functions? The answer to this is that the proofs by Joyce and others assume a very strong kind of independence between belief states and possible worlds. Even though there is no world where c(q) = 1, c(~q) = 0, and where q is true, Joyce and others still consider such worlds when working out which credence functions dominate or maximize expected epistemic value. With these possible worlds back in play, the incoherent c* is dominated. In particular, for the desired results (that probabilism is true, that conditionalization is the correct updating rule, and so forth) to go through, we must be able to assess how accurate a doxastic state is in a world where that doxastic state could not be held. Further, we must maintain that facts about the accuracy of doxastic states in worlds where they cannot be held are sometimes relevant to our evaluation of a doxastic state in some other world where it is actually held. This might lead one to question whether this accuracy-first approach really is a form of epistemic consequentialism (though that is of course complicated by the fact that there is no consensus about what it takes to be a consequentialist theory) and indeed whether the evaluative framework can be motivated.

d. Traditional Epistemology: Justification

i. Coherentism

According to coherentism about justification, a belief is justified if and only if it belongs to a coherent system of beliefs (note that the term “coherent” here refers to some informal notion of coherence, perhaps related to, but distinct from, the notion of coherent credences). This on its own does not commit coherentists to any sort of epistemic consequentialism. However, some of the debates and claims made within the coherentist literature suggest that some prominent coherentists are committed to some form of epistemic consequentialism. For instance, in The Structure of Empirical Knowledge, BonJour (1985) defends a version of coherentism about justification. In this work, BonJour devotes an entire chapter to giving an argument for the following thesis:

A system of beliefs which (a) remains coherent (and stable) over the long run and (b) continues to satisfy the Observation Requirement is likely, to a degree which is proportional to the degree of coherence (and stability) and the longness of the run, to correspond closely to independent reality. (p. 171)

BonJour is thus attempting to show that the degree of coherence of a set of beliefs is proportional to the likelihood that those beliefs are true. He calls this a metajustification for his coherence theory of justification. And why is such a metajustification required? He writes:

The basic role of justification is that of a means to truth, a more directly attainable mediating link between our subjective starting point and our objective goal. […] If epistemic justification were not conducive to truth in this way, if finding epistemically justified beliefs did not substantially increase the likelihood of finding true ones, then epistemic justification would be irrelevant to our main cognitive goal and of dubious worth. […] Epistemic justification is therefore in the final analysis only an instrumental value, not an intrinsic one. (pp. 7–8)

This strongly suggests that BonJour thinks of the epistemic right—justification—in consequentialist terms (Berker (2013a) claims that BonJour (1985) should be understood in this way). If justification understood as coherence is not conducive to truth, then justification understood as coherence is not valuable. This suggests the following picture:

Step 1. Final Value: True beliefs have final epistemic value; false beliefs have final epistemic disvalue.

Step 2. Ranking: Sets of beliefs are ranked in terms of their degree of coherence where this degree of coherence is proportional to the likelihood that the set of beliefs is true.

Step 3. Normative Facts: A belief is justified iff it belongs to a set of beliefs that is coherent above some threshold.

The claim in Step 2, that coherence is truth-conducive, has been addressed explicitly in the literature, starting with Klein and Warfield (1994). They argue that the fact that one set of propositions is more coherent than another set does not entail that the conjunction of the propositions in the first set is more likely to be true than the conjunction of propositions in the second set. The basic argument is that a set of propositions (say, the set including a and b) can sometimes be made more coherent by adding an additional proposition to it (to yield the set including a, b, and c). However, the conjunction (a and b and c) is never more probable than the conjunction (a and b). Bovens and Hartmann (2003) and Olsson (2005) add to this literature and each prove results to the effect that no matter one’s measure of coherence, there will be cases where one set is more coherent than another, but its propositions are less likely. (For one response to these arguments, see Huemer (2011); Angere (2007) considers whether these arguments undermine BonJour’s coherentism.)

In light of difficulties establishing that coherence is truth-conducive, it is open to coherence theorists to not go down the consequentialist route. Such a coherentist might maintain that beliefs that are members of coherent sets are epistemically right independent of whether such sets are likely to be true. This mimics the non-consequentialist Kantian who maintains that certain actions are right independent of the final value that taking these actions leads to.

ii. Reliabilism

Reliabilism about justification, as championed by Alvin Goldman (1979), maintains that beliefs are justified when they are produced by suitably reliable processes. Put another way, beliefs are justified when produced by the right kinds of processes, and the right kinds of processes are those that are truth-conducive. One helpful way to think about the consequentialist structure of reliabilism is to think of it as analogous to rule utilitarianism. According to the rule utilitarian, we evaluate moral rules for rightness directly in terms of the consequences of their widespread acceptance. Actions are then evaluated in terms of whether or not they conform to a right rule. Similarly, according to reliabilism, the things up for direct consequentialist evaluation are not acts of acceptance or particular beliefs that could be adopted. Rather, processes of belief formation are evaluated consequentially. Reliabilists tend to see true belief as the sole thing of final epistemic value. Processes are thus evaluated based on their truth-ratios, the ratio of true beliefs produced to total beliefs produced. However, unlike a maximizing theory, reliabilism maintains that a process is acceptable just in case it has a truth-ratio above some absolute threshold. It is thus different from maximizing theories in two ways. First, a process can be acceptable even if it is not the most reliable process and thus not the optimally truth-conducive process. Second, a process need not be acceptable even if it is the most reliable process, because the reliabilist requires that processes meet some minimum threshold to be acceptable.

We can put a simple version of reliabilism about justification into our consequentialist recipe:

Step 1. Final Value: True beliefs have final epistemic value; false beliefs have final epistemic disvalue.

Step 2. Ranking: Processes are put into two classes: acceptable and not acceptable. If the process has a reliability score at or above the threshold, the process is acceptable; otherwise, it is not acceptable. The reliability score of a process p at world w is given by the sum of the true beliefs that process p produces at w divided by the sum of the total beliefs that process p produces at w (that is, the truth-ratio of p at w).

Step 3. Normative Facts: A belief is justified for S at t at w iff S’s belief at t at w is produced by an acceptable belief-forming process at w.
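The threshold structure of this recipe can be illustrated with a minimal sketch. The threshold value, the process names, and the track records below are all hypothetical and serve only to show how a threshold test differs from a maximizing test.

```python
THRESHOLD = 0.8  # hypothetical cutoff; the recipe does not fix a value


def truth_ratio(true_beliefs, total_beliefs):
    """Reliability score: true beliefs produced divided by total beliefs produced."""
    return true_beliefs / total_beliefs


def is_acceptable(true_beliefs, total_beliefs, threshold=THRESHOLD):
    """Step 2: a process is acceptable iff its truth-ratio meets the threshold."""
    return truth_ratio(true_beliefs, total_beliefs) >= threshold


# Hypothetical track records: (true beliefs, total beliefs) at a world.
track_records = {"vision": (95, 100), "memory": (85, 100), "guessing": (50, 100)}
acceptable = {
    name for name, (true, total) in track_records.items()
    if is_acceptable(true, total)
}

# A pool of processes in which even the most reliable one falls short.
poor_pool = {"guessing": (50, 100), "wishful thinking": (30, 100)}
best_of_poor = max(poor_pool, key=lambda name: truth_ratio(*poor_pool[name]))

print(acceptable, best_of_poor, is_acceptable(*poor_pool[best_of_poor]))
```

Here memory counts as acceptable without being the most reliable process, and the most reliable process in the second pool is still not acceptable, which are the two departures from a maximizing theory noted above.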

There are subtle ways in which reliabilism can differ from what the recipe above suggests. One of the most notable differences concerns Goldman’s (1986) approach. Although Goldman (1979) gives a theory that looks very much like what is represented above, in Goldman (1986) it is not individual processes that are ranked at Step 2, but rather systems of rules about which processes may and may not be used. A system of rules is then acceptable if and only if a believer who follows those rules has an overall truth-ratio above a certain threshold. Thus, the analogy to rule utilitarianism is even stronger in Goldman (1986) than in Goldman (1979), something which he explicitly notes. There has also been some dispute among reliabilists about the exact way that processes should be scored for their reliability (and so the exact form of Step 2), but despite that, the view looks to be committed to some form of consequentialism.

iii. Evidentialism

One of the main rivals of reliabilism about justification is evidentialism, initially defended by Richard Feldman and Earl Conee (1985) (whether evidentialism is a rival of coherentism depends subtly on exactly how the views are spelled out). Evidentialism maintains that the belief that p is justified for an agent at time t iff p is supported by the agent’s total evidence at t. Conee (1992) motivates the total evidence requirement with reference to an overriding goal of true belief, in which case evidentialists agree with reliabilists and with BonJour-style coherentists that justification is a matter of truth conduciveness. Feldman (2000) motivates the total evidence requirement with reference to an overriding goal of reasonable belief (rather than true belief), in which case evidentialists disagree with reliabilists and BonJour-style coherentists about the nature of final epistemic value, but agree that justification should be spelled out in consequentialist terms. More recently, Conee and Feldman (2008) have suggested that what has final epistemic value is coherence. Whether this view is committed to consequentialism depends on how the details are spelled out. If the idea is that a doxastic state is justified in proportion to how much it promotes the value of coherence, whether in itself or in its causal consequences, then such a view is plausibly committed to consequentialism, with the good of coherence substituted for the good of true belief. However, there may be other ways of interpreting their view according to which it looks less committed to consequentialism.

It should be noted that Feldman (1998) makes clear that the only thing relevant to whether one should believe p is one’s evidence now concerning p’s truth. The causal consequences of believing p are explicitly ruled out by Feldman as relevant to that belief’s justificatory status. So if Feldman is to count as a consequentialist, it is of a very restricted sort. Presumably, Feldman holds something similar in Conee and Feldman (2008). Conee (1992), on the other hand, has expressed more sympathy with the idea that we should sometimes sacrifice epistemic value now for more epistemic value later. Thus, there is perhaps a stronger case that Conee’s version of evidentialism is also some form of consequentialism.

e. Traditional Epistemology Not Concerned with Justification

Stephen Stich (1990) offers a method of epistemic evaluation that is not concerned with justification but is committed to consequentialism. According to Stich, there are no special epistemic values (such as true belief); there are just things that people happen to value. Reasoning processes and reasoning strategies are seen as one tool that we use to get what we value. Stich (1990, p. 24) writes: “One system of cognitive mechanism is preferable to another if, in using it, we are more likely to achieve those things that we intrinsically value.” Thus, we have cognitive mechanisms being ranked in terms of their consequences, but where the consequences that matter are not uniquely epistemic, but rather anything that we happen to intrinsically value.

Richard Foley’s (1987) The Theory of Epistemic Rationality is not directed at analyzing justification. Nevertheless, it provides another example of work in traditional epistemology that seems to be committed to some form of epistemic consequentialism. Foley identifies our epistemic goal as that of now believing those propositions that are true and not now believing those propositions that are false. It is then epistemically rational for a person to believe a proposition whenever on careful reflection that person has reason to believe that believing that proposition will promote his or her epistemic goals, provided that all else is equal. Foley is clear, however, that he does not intend his view to sanction as rational adopting a belief that one is now confident is false in order to garner more true beliefs later. Thus, like some of the other views canvassed here, Foley adopts something like a consequentialist framework for evaluating beliefs, but in a restricted way, where the causal consequences of beliefs are not relevant to the normative verdicts of those beliefs.

Though a large focus of Goldman (1986) is to give a reliabilist account of justification, he notes that there are other important ways that processes, and thus that beliefs produced by those processes, can be evaluated. In particular, Goldman considers evaluating processes for their speed and for their power. The speed of a process concerns how quickly a process issues true beliefs. The power of a process concerns how much information a process gives to you. A highly reliable process might have very little speed if it takes a very long time to issue a belief. And the same highly reliable process might have very little power if it produces only that one belief. Goldman suggests that we can use a consequentialist-style analysis to evaluate processes in these ways, too.

Bishop and Trout (2005) argue against the practice of so-called standard analytic epistemology, which includes many of the approaches to justification looked at above. Bishop and Trout propose a view according to which we evaluate reasoning strategies by drawing on empirical work in psychology, rather than by consulting our intuitions. According to Bishop and Trout, the three factors that affect the quality of a reasoning strategy are: (1) whether the strategy is reliable across a wide range of problems, (2) the ease with which the strategy is used, and (3) the significance of the problems toward which the reasoning strategy can be used. They emphasize that whether a set of reasoning strategies is an excellent one to use depends on a cost/benefit analysis. It is natural, then, to think of their normative verdicts about whether a reasoning strategy is excellent as depending on the consequences of using that strategy along dimensions (1)–(3).

In this section and in the one before, we have seen that some traditional epistemologists with otherwise diverse views about justification or epistemic evaluation more generally seem to be committed, at bottom, to a kind of epistemic consequentialism. The aforementioned theories do not merely identify some bearer of final epistemic value, but also define one designator of epistemic rightness (for example, justification, rationality, epistemic excellence) in terms of such value.

f. Social Epistemology

Social epistemology is concerned with the way that social institutions, practices, and interactions are related to our epistemic endeavors, such as knowledge generation. Several prominent approaches within social epistemology also seem to be committed to some form of epistemic consequentialism.

Alvin Goldman’s (1999) Knowledge in a Social World is a nice example of social epistemology done with explicit commitments to consequentialism. Goldman writes:

People have interest, both intrinsic and extrinsic, in acquiring knowledge (true belief) and avoiding error. It therefore makes sense to have a discipline that evaluates intellectual practices by their causal contributions to knowledge or error. This is how I conceive of epistemology: as a discipline that evaluates practices along truth-linked (veritistic) dimensions. Social epistemology evaluates specifically social practices along these dimensions. (p. 69)

Goldman’s general approach is to adopt a question-answering model. According to this approach, beliefs in propositions have value or disvalue when those propositions are answers to questions that interest the agent. This suggests that Goldman promotes a view according to which final epistemic value is accuracy with respect to questions of interest, and not mere accuracy alone. As Goldman conceives of it, the epistemic value of believing a true answer to a question of interest is 1, the epistemic value of withholding belief in a true answer is 0.5, and the epistemic value of rejecting a true answer is 0. Goldman extends this to degrees of belief in the natural way: the epistemic value of having a degree of belief x in a true proposition is x. (It is worth noting that this corresponds to a scoring rule that is improper; compare section 3c.) We can then evaluate social practices instrumentally, in terms of their causal contributions to belief states that have final epistemic value. Goldman does this by first specifying the appropriate range of applications for a practice. This will involve actual and possible applications (because some practices do not have an actual track record). Second, one takes the average performance of the practice across these applications. The average performance of a practice determines how it is ranked compared to its competitors. Thus, on this view, it is something like objective expected epistemic value that ranks the various practices.
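The remark that this linear measure is improper can be illustrated with a small sketch. It assumes, beyond what is stated above, that when the proposition is false the agent earns her degree of belief in its true negation, 1 - x. The quadratic comparison rule and all function names are illustrative choices, not Goldman's.

```python
def expected_linear(p, x):
    """Expected value of holding degree of belief x, by the lights of credence p.

    Assumption: value is x if the proposition is true, and 1 - x (the degree
    of belief in the true negation) if it is false.
    """
    return p * x + (1 - p) * (1 - x)


def expected_quadratic(p, x):
    """Same expectation under a quadratic (Brier-style) accuracy measure."""
    return p * (1 - (1 - x) ** 2) + (1 - p) * (1 - x ** 2)


grid = [i / 100 for i in range(101)]
p = 0.7  # the agent's actual credence (hypothetical)

best_linear = max(grid, key=lambda x: expected_linear(p, x))
best_quadratic = max(grid, key=lambda x: expected_quadratic(p, x))

print(best_linear, best_quadratic)
```

Under the linear measure an agent with credence 0.7 expects the extreme degree of belief 1 to do best, so the measure recommends moving away from her own credence. Under the quadratic measure her expectation is maximized by keeping 0.7.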

Consider an example. Goldman argues that civil-law systems are better, from an epistemic perspective, than are common-law systems. The argument for this is complex, but the general structure follows the framework described above. Goldman considers various differences between the two systems, including the numerous exclusionary evidentiary rules in the common-law system as compared to the civil-law system, the large role that adversarial lawyers play in the common-law system as compared to the civil-law system, and the fact that the civil-law system employs trained judges as decision-makers rather than lay jurors. With respect to each of these differences, one can approximate the epistemic value for the relevant decision-makers under each system. For instance, one can estimate how many correct verdicts compared to incorrect verdicts jurors would reach if there were exclusionary evidentiary rules compared to if there were not. On balance, Goldman argues, the civil-law system performs better. For another evaluation of legal structures in consequentialist terms, see Laudan (2006).

Goldman (1999) directs this same style of consequentialist argument toward a variety of social practices, including testimony, argumentation, Internet communication, speech regulation, scientific conventions, law, voting, and education.

Note, however, an important shift in the consequentialist view Goldman defends here compared to earlier theories considered. Previously, the things being evaluated have been belief states or acts of acceptance. Here, Goldman is evaluating social practices and methodologies. We could call the approach in Goldman (1999) an instance of methodological epistemic consequentialism, whereas the former theories are instances of doxastic epistemic consequentialism (note that this terminology is not standard and is introduced simply for clarity within this article).

The basic view can be put into our recipe as follows:

Step 1. Final Value: Accurate beliefs of S in answer to questions that interest S have final epistemic value.

Step 2. Ranking: Social practices are ranked according to the average amount of final epistemic value that they produce across the range of situations they can be applied to.

Step 3. Normative Facts: Social practice A is epistemically better than social practice B just in case A and B are alternatives to each other and A is ranked higher than B in Step 2.

For criticism of Goldman’s social epistemology that focuses specifically on its consequentialist commitments, see DePaul (2004). See also Fallis (2000, 2006).

g. Philosophy of Science

Though Goldman’s work in social epistemology touches on aspects of science, more generally his focus is on social practices. Others are interested in similar questions about social practices, structures, and conventions, but specifically with respect to science. In some of this work, there is a clear foundation of something like epistemic consequentialism.

i. Group versus Individual Rationality

Philip Kitcher (1990) is one of the first to apply formal models to social structures in science to determine the optimal structure for a group of researchers to achieve their scientific goals. The guiding idea behind his work is that if everyone were rational, then they would each make decisions about which projects to explore based on what the evidence supports and there would be a uniformity of practices among scientists. This uniformity would be bad, however, because it would prevent people from pursuing research on new up-and-coming theories (for example, continental drift in the 1920s) as well as on older outgoing theories (for example, phlogiston theory in the 1780s). Kitcher defines two notions: X’s personal epistemic intentions are what X wishes to achieve himself and X’s impersonal epistemic intentions are what X wishes his community to achieve. The question at hand can then be put: how would scientists rationally decide to coordinate their efforts if their decisions were dominated by their impersonal epistemic intentions?

Kitcher formalizes this situation by supposing that there are N researchers working on a particular research question, and each has to determine which research program she will pursue. Define a return function, P_i(n), which represents the chance that program i will be successful given that n researchers are pursuing it. Suppose that each researcher’s personal epistemic intention is to successfully answer the research question. In that case, each researcher will adopt whichever program i has the largest value for P_i(n_i + 1), where n_i is the number of researchers currently pursuing i. However, if we suppose that each researcher’s impersonal epistemic intention is that someone in the community of researchers successfully answers the question, then this way of choosing research programs may not be the way to realize the impersonal epistemic intention. Consider a simple example where there are two research programs, 1 and 2, and N researchers. The best way to achieve the group goal is to choose the number n of researchers in program 1 so as to maximize P₁(n) + P₂(N–n). But this could be a different distribution than the one that would result were each researcher to be guided by her personal epistemic intention. To see this, suppose that there are j researchers in program 1 and k researchers in program 2. It could be that P₁(j+1) > P₂(k+1) and so a new researcher would choose program 1. But for all that, it could be that P₁(j+1) – P₁(j) < P₂(k+1) – P₂(k). That is, the boost in probability of success that program 2 gets from the addition of one more researcher is greater than that of program 1. In that case, it is better for the group for a new researcher to join program 2. Kitcher goes on to argue that certain intuitively unscientific goals such as the goal of fame or popularity could help motivate researchers into a division of labor that helps to reach the impersonal goals rather than the personal goals of each researcher.
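The gap between the individually chosen distribution and the group-optimal one can be seen in a small numerical sketch. The return functions, their parameters, and the community size below are hypothetical choices made for illustration. The sketch also assumes that researchers arrive one at a time and that each compares the success chance each program would have once she joins it.

```python
import math

N = 10  # hypothetical number of researchers


def p1(n):
    """Hypothetical return function for program 1 (the more promising one)."""
    return 0.6 * (1 - math.exp(-0.5 * n))


def p2(n):
    """Hypothetical return function for program 2."""
    return 0.3 * (1 - math.exp(-0.5 * n))


def individual_allocation(total):
    """Each arriving researcher joins the program with the higher chance of
    success once she is on board (her personal epistemic intention)."""
    n1 = n2 = 0
    for _ in range(total):
        if p1(n1 + 1) >= p2(n2 + 1):
            n1 += 1
        else:
            n2 += 1
    return n1, n2


def group_value(n1, total):
    """Chance that someone succeeds: P1(n) + P2(N - n)."""
    return p1(n1) + p2(total - n1)


selfish = individual_allocation(N)
best_n1 = max(range(N + 1), key=lambda n: group_value(n, N))

print("individual choice:", selfish, round(group_value(selfish[0], N), 3))
print("group optimum:", (best_n1, N - best_n1), round(group_value(best_n1, N), 3))
```

With these particular functions every researcher joins program 1, whereas the community's chance of success is highest when some researchers work on program 2.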

Kitcher does not claim that there is one objective answer to what the appropriate epistemic intentions or values are. Nevertheless, there is a consequentialist structure to his argument. Groups of scientists are seen as rational when they choose among options in such a way that they maximize their chance of attaining their epistemic goals. One could question whether this is enough to make the view count as a version of epistemic consequentialism. After all, the options that the agents in Kitcher’s model are choosing between are not beliefs or belief states, but instead decisions about which research program to pursue or about which experiment to run. In this way, Kitcher’s view looks to be an instance of methodological epistemic consequentialism as opposed to doxastic epistemic consequentialism: it is aimed at evaluating actions that are in some way closely related to epistemic ends, rather than at evaluating belief states themselves. Some have argued that approaches such as these do not actually address properly epistemic questions at all. For some thoughts on this, see Christensen (2004, 2007).

Others have followed the general argumentative structure of Kitcher (1990). Zollman (2007, 2010) and Mayo-Wilson, Zollman, and Danks (2011) have focused on the communication networks that might exist between scientists working on the same project. This work reveals some surprising conclusions, in particular, that it might sometimes be epistemically beneficial for the community of scientists to have less than full communication among the members. The basic reason for this is that limiting communication is one way to encourage diversity in research programs, which for Kitcher-like reasons can help the community do better than it otherwise would. Muldoon and Weisberg (2009) and Muldoon (2013) have focused on the kinds of research strategies that individual scientists might have, modeling scientific research as a hill-climbing problem of the kind studied in computer science. They show how it can sometimes be beneficial for the group of scientists to have individuals who are more radical in their exploration strategies.

So far we have surveyed formal models in the philosophy of science literature that seem to take a consequentialist approach to epistemic evaluation. One of the main results of this work is to show how strategies that would be irrational if followed in isolation might yield rational group behavior. Others have emphasized something like this point, but without formal models. Miriam Solomon (1992), for instance, argues for a similar conclusion by drawing on work in psychology and considering the historical data about the shift in geology to accept continental drift. She argues that certain seeming psychological foibles of individual geologists, including cognitive bias and belief preservation, played an important role in the discovery of plate tectonics. Paradoxically, she argues, these attributes that are normally seen as rational failings were in fact conducive to scientific success because they made possible the distribution of research effort. That her work employs a kind of consequentialist picture is evidenced by the fact that she views the central normative question in the philosophy of science to be: “whether or not, and where and where not, our methods are conducive to scientific success…Scientific rationality is thus viewed instrumentally.” (p. 443)

Larry Laudan is another philosopher of science who adopts a generally consequentialist outlook. For Laudan (1984), the things we are ultimately evaluating are methodological rules. Writes Laudan:

… a little reflection makes clear that methodological rules possess what force they have because they are believed to be instruments or means for achieving the aims of science. More generally, both in science and elsewhere, we adopt the procedural and evaluative rules we do because we hold them to be optimal techniques for realizing our cognitive goals or utilities. (1984, p. 26)

There is, on Laudan’s view, not one set of acceptable cognitive goals, although there are ways to rationally challenge the cognitive goals that someone holds. This can be done by either showing that the goals are unrealizable or showing that the goals do not reflect the communal practices that we endorse. On Laudan’s view, then, what has final epistemic value is the realizing of the cognitive goals that we have, so long as these goals are not ruled out in one of the ways above. We can then rank methodological rules, or groups of methodological rules, in virtue of how well they reach those cognitive goals that we have. We then evaluate those rules as rational or not in virtue of this ranking. Laudan does not say that the methodological rules must be optimal, but does suggest, as the quote above notes, that we must think that they are.

ii. Why Gather Evidence?

Another area of philosophy of science that seems committed to epistemic consequentialism concerns the initially odd-sounding question: why should a scientist gather more evidence? On its face, the answer to this question is obvious. But if we idealize scientists as perfectly rational agents, some models of rationality make the question more pressing. For instance, consider an austere version of the Bayesian account of epistemic rationality according to which one is epistemically rational if and only if one’s degrees of belief are probabilistically coherent and one updates one’s beliefs via conditionalization upon receipt of any evidence. An agent can do this perfectly well without ever gathering new evidence. In addition, notice that there is a risk associated with gathering new evidence. Although in the best-case scenario, one acquires information that moves one closer to the truth, it is of course possible that one gets misleading evidence and so is pushed further from the truth. Is there anything that can be said in defense of the intuitive verdict that despite this, it is still rational to gather evidence?

An early answer to this question is provided by I. J. Good (1967). Suppose that you are going to have to make a decision and you can perform an experiment first and then make the decision or you can simply make the decision. Good shows that if you choose by maximizing subjective expected value, if there is no cost of performing the experiment, and if several other constraints are imposed, then the subjective expected value of your choice is always at least as great after performing the experiment as before. Here then we have an argument in favor of a certain sort of epistemic behavior—gathering evidence—that is consequentialist at heart. It says that if you do this sort of thing, you can expect to make better choices. However, it is not clear that this is an epistemic consequentialist argument. At best, it suggests that experimenting is pragmatically rational. To drive this point home, note that it seems there are experiments that are epistemically rational to perform even if there is no reason to expect that any decision we will make depends on the outcome.
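Good's result can be illustrated, though of course not proved, with a small numerical sketch. The states, acts, utilities, and experiment likelihoods below are hypothetical. The sketch assumes a cost-free experiment and an agent who updates by conditionalization and then maximizes subjective expected value.

```python
# Hypothetical decision problem: two states, three acts, one free experiment.
prior = {"s1": 0.5, "s2": 0.5}
utility = {
    "act_a": {"s1": 10, "s2": 0},
    "act_b": {"s1": 0, "s2": 10},
    "act_c": {"s1": 6, "s2": 6},
}
# likelihood[state][outcome]: chance of each experimental outcome in each state.
likelihood = {
    "s1": {"e1": 0.8, "e2": 0.2},
    "s2": {"e1": 0.3, "e2": 0.7},
}


def best_expected_utility(probs):
    """Expected utility of the act that maximizes expected utility under probs."""
    return max(
        sum(probs[s] * payoff[s] for s in probs) for payoff in utility.values()
    )


# Decide now, without experimenting.
value_without = best_expected_utility(prior)

# Experiment first, conditionalize on the outcome, then decide.
value_with = 0.0
for outcome in ("e1", "e2"):
    p_outcome = sum(prior[s] * likelihood[s][outcome] for s in prior)
    posterior = {s: prior[s] * likelihood[s][outcome] / p_outcome for s in prior}
    value_with += p_outcome * best_expected_utility(posterior)

print(value_without, value_with)
```

In this example the agent who decides at once does best with the safe act, while the agent who experiments first can tailor her act to the outcome, so her expected value before looking is higher.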

Others, however, have attempted to extend the basic Good result to scenarios where only final epistemic value is at issue. Oddie (1997), for instance, shows that if one uses a proper scoring rule to measure accuracy and if one updates via conditionalization, then the expected final epistemic value of learning information from a partition is always at least as great as that of refusing to learn the information. Myrvold (2012) generalizes this basic result and shows that something similar holds even if we do not require that one updates via conditionalization. Instead, so long as one satisfies Bas van Fraassen’s (1984) reflection principle, something similar to Oddie’s result holds. For commentary on van Fraassen’s reflection principle, see Maher (1992). For other work on the issue of gathering evidence, see Maher (1990) and Fallis (2007).
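A purely epistemic analogue of the Good-style calculation can be sketched as follows. The worlds, the prior, the hypothesis H, and the partition are hypothetical. Accuracy is measured by one proper rule chosen for illustration, namely 1 minus the squared distance between the credence in H and H's truth value. The sketch illustrates the kind of result Oddie proves and is not his proof.

```python
# Hypothetical prior over four worlds; H is true at w1 and w3.
prior = {"w1": 0.4, "w2": 0.1, "w3": 0.2, "w4": 0.3}
H = {"w1", "w3"}
# The agent can learn which cell of this partition contains the actual world.
partition = [{"w1", "w2"}, {"w3", "w4"}]


def credence(prop, given=None):
    """Prior credence in prop, conditional on `given` (all worlds if None)."""
    given = set(prior) if given is None else given
    return sum(prior[w] for w in prop & given) / sum(prior[w] for w in given)


def accuracy(cred_in_h, world):
    """Quadratic accuracy of a credence in H at a world; higher is better."""
    truth = 1.0 if world in H else 0.0
    return 1 - (truth - cred_in_h) ** 2


# Refuse to learn: keep the prior credence in H whatever the world is.
expected_if_refuse = sum(prior[w] * accuracy(credence(H), w) for w in prior)

# Learn: in each world, conditionalize on the cell that contains that world.
expected_if_learn = sum(
    prior[w] * accuracy(credence(H, cell), w) for cell in partition for w in cell
)

print(expected_if_refuse, expected_if_learn)
```

By the agent's own prior lights, the credence she would have after learning is expected to be at least as accurate as the credence she has now.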

Work in this area seems clearly committed to an especially veritistic form of epistemic consequentialism. Here we have an argument in favor of acquiring new evidence (if it is available) that appeals solely to the increase in accuracy one can expect to get from such evidence. As Oddie (1997, p. 537) writes: “The idea that a cognitive state has a value which is completely independent of where the truth lies is just bizarre. Truth is the aim of inquiry.”

4. Summing Up: Some Useful Distinctions

Now that we have surveyed a variety of theories that seem to have some commitment to epistemic consequentialism, it is useful to remind ourselves of two important distinctions relevant to categorizing different species of epistemic consequentialism.

First, some of the theories discussed above are committed to restricted consequentialism. According to these views, the normative facts about Xs are determined by some restricted set of the consequences of the Xs. More precisely, consider a theory that will issue normative verdicts about some belief b. A restricted consequentialist view maintains that something has final epistemic value, but that the normative facts about b are not determined by the amount of final epistemic value contained in the entire set of b’s causal consequences. In the limit, none of the causal consequences of b are relevant; only the final epistemic value contained in b itself is relevant. For instance, Feldman’s view about justification, Foley’s view about rationality, the approach of cognitive decision theory, and some versions of the accuracy-first program appear to be restricted consequentialist views in this limiting sense. Feldman, recall, explicitly states that the causal consequences of adopting a belief are irrelevant to its justificatory status; Foley focuses on the goal of now believing the truth and not now believing falsely, so excludes causal consequences; and Joyce’s accuracy-first program looks at whether some doxastic state dominates another doxastic state when the states are looked at for their accuracy now. Reliabilism is arguably also a form of restricted consequentialism, because the causal consequences of the belief itself are not relevant to its normative status; rather, it is the status of the particular process of belief formation that led to the belief that is relevant to the belief’s normative status. A process of belief formation earns its status, in turn, in terms of the proportion of true beliefs that it directly produces, so not even the total consequences of a belief-forming process are relevant according to the reliabilist.

Unrestricted consequentialist views, on the other hand, are those according to which the normative facts about whatever is being evaluated are determined by the amount of final epistemic value in the entire set of that thing’s causal consequences. It is unclear whether we have seen any wholly unrestricted consequentialist views in this sense, although Goldman’s approach to social epistemology and Kitcher’s approach to the distribution of cognitive labor may come close.

It is something of an open question whether a restricted consequentialism is genuinely a form of consequentialism. Some discussions of consequentialism in ethics suggest that restricted versions of consequentialism are not genuinely instances of consequentialism (see, for instance, Pettit (1988), Portmore (2007), Smith (2009), and Brown (2011)). Klausen (2009) argues that restricted versions of consequentialism are not genuinely instances of consequentialism, specifically with respect to epistemology.

The second important distinction to keep in mind when categorizing species of epistemic consequentialism is a distinction between those theories that seek to evaluate belief states and those that seek to evaluate some sort of action of some epistemic relevance. An example will make this distinction clearer. The accuracy-first program seeks to evaluate belief states based solely on their accuracy. Kitcher’s approach to the distribution of cognitive labor seeks to evaluate the decisions of scientists to engage in certain lines of research based on the ultimate payoff in terms of true belief for the scientific community. As noted above, we could call the first approach an instance of doxastic epistemic consequentialism and the second sort of approach an instance of methodological epistemic consequentialism (again, note that these terms are not established in the literature). With this distinction in hand, we can sort some of the theories above along this dimension. Attempts to explain why it is rational to gather evidence, much of social epistemology, and the work on communication structures and exploration strategies among scientists are instances of methodological epistemic consequentialism. Consequentialist analyses of justification, cognitive decision theory, and the accuracy-first program are instances of doxastic epistemic consequentialism.

5. Objections to Epistemic Consequentialism

Theories committed to some form of epistemic consequentialism will have specific objections that can be lodged against them. Here we will focus on general objections to the fundamental idea behind epistemic consequentialism.

a. Epistemic Trade-Offs

Epistemic consequentialists maintain that, in some way, the right option is one that is conducive to whatever has final epistemic value. Say that you accept a trade-off if you sacrifice something of value for even more of what is valuable. Thus, if true belief has final epistemic value (and if each true belief has equal final epistemic value), you accept a trade-off when you sacrifice a true belief concerning p for two true beliefs about q and r. It is hard to see how one can hold a consequentialist view and not think that it is at least sometimes permissible to accept trade-offs. For then it would seem that rightness is no longer being understood in terms of conduciveness to what has value (though, as we will see, restricted consequentialists of a certain sort may be able to deny this).

The permissibility of accepting trade-offs, however, constitutes a problem for epistemic consequentialism. If one thinks about consequentialist theories in ethics, this is not so surprising. Some of the strongest intuitive objections to consequentialist moral theories are those that focus on trade-offs. Consider, for instance, the organ harvest counterexample to utilitarianism (Thomson 1985). In that scenario, a doctor has five patients, each in dire need of a different organ transplant. The doctor also has a healthy patient who is a potential donor for each of the five patients. Because it is a consequentialist moral theory and endorses trade-offs, it seems that utilitarianism says the doctor is required to sacrifice the one to save the five. But, it is alleged, this flies in the face of common sense, and so we have a challenge for utilitarianism.

Trade-off objections to epistemic consequentialism (structurally similar to the organ harvest) have been made explicitly by Firth (1981), Jenkins (2007), Littlejohn (2012), Berker (2013a, 2013b), and Greaves (2013). One can also see hints of such an objection in Easwaran and Fitelson (2012) and Caie (2013).

The basic objection starts with the observation that a belief can be justified or rational or epistemically appropriate (or whatever other term for epistemic rightness one prefers) even if adopting that belief causes some epistemic catastrophe. Similarly, it seems that a belief can be unjustified or irrational or epistemically inappropriate even if adopting that belief results causally in some epistemic reward. For an example of the first sort, S might have significant evidence that he is an excellent judge of character and so S believing this about himself might be justified for S. But it could be that this belief serves to make S overconfident in other areas of his life and so S ends up misreading evidence quite badly in the long run. For an example of the second sort, S might have no evidence that God exists, but believe it anyway to make it more likely that S receives a large grant from a religiously affiliated (and unscrupulous) funding agency. The grant will allow S to believe many more true and interesting propositions than otherwise (the example is due to Fumerton (1995), p. 12). These kinds of examples seem to show that epistemic rightness cannot be understood in terms of conduciveness to what has epistemic final value.

There are two main responses that the epistemic consequentialist can make to the trade-off objection, and each comes with a challenge. The first response is to maintain that, appearances to the contrary, there are versions of epistemic consequentialism that do not sanction unintuitive trade-offs. For a response in this vein, see Ahlstrom-Vij and Dunn (2014). In ethics, some who think of themselves as consequentialists respond to analogous objections by introducing agent-relative values (see, for instance, Sen (1982) and Broome (1991)). The basic idea is that we can have agent-relative values in the outcomes of states, which allows, for example, for agent S to value the state where S breaks no promises more than someone else values that same state. This allows one to give a consequentialist-based evaluation of rightness that does not always require one to say that it is right for S to break a promise in order to ensure that two others do not break their promises. It is not clear how such a modification of consequentialism would best carry over to epistemic consequentialism, but it could represent a way of making this first response. The challenge for any response in this vein is to explain how such views are genuinely an instance of epistemic consequentialism.

The second response to trade-off objections is to maintain that while epistemic consequentialism does sanction trade-offs, we can explain away the felt unintuitiveness of such verdicts. The challenge for this second response is to actually give such an explanation.

b. Positive Epistemic Duties

When it comes to moral obligation, it seems plausible that we sometimes have obligations to take certain actions and sometimes have obligations to refrain from certain actions. It is then natural to distinguish between positive duties—say, the obligation to take care of my children—and negative duties—say, the obligation to not steal from others. Consider how a similar distinction would be drawn in epistemology. Obligations to believe certain propositions would correspond to positive epistemic duties, while obligations to refrain from believing certain propositions would correspond to negative epistemic duties.

Littlejohn (2012) has argued that certain forms of epistemic consequentialism look as though they will naturally lead to positive epistemic duties. Suppose, as certain doxastic epistemic consequentialists will maintain, that whether we are obligated to believe or refrain from believing a proposition is a function of the final epistemic value of believing or refraining from believing that proposition. And suppose that the consequentialist also maintains that we have some negative epistemic duties; that is, there are situations where one is epistemically obligated to refrain from believing a proposition. The consequences of refraining in such a situation will have some level of epistemic value. But it seems that we can surely find a situation where believing a proposition has consequences with equal epistemic value. Thus, it looks as though the consequentialist is committed to saying that there are positive epistemic duties: sometimes we are obligated to believe propositions.

However, some epistemologists hold that we have no positive epistemic duties. We may be obligated to refrain from believing certain things, but we have no duties to believe. Nelson (2010) provides one argument for this claim. He argues that if we had positive epistemic duties, we would have to believe each proposition that our evidence supported. But this means we would be epistemically obligated to believe infinitely many propositions, as Nelson argues that any bit of evidence supports infinitely many propositions. As we cannot believe infinitely many propositions, Nelson holds that we have no positive epistemic duties.

The thesis that there are no positive epistemic duties is controversial, as is Nelson’s argument for that claim. Nevertheless, this presents a potential worry for certain versions of epistemic consequentialism. It is perhaps worth noting that this sort of objection to epistemic consequentialism is in some ways analogous to objections that maintain that consequentialist views in ethics are overly demanding. For more on the issue of positive epistemic duties, see Stapleford (2013) and the discussion in Littlejohn (2012, ch. 2).

c. Lottery Beliefs

Suppose that you know there is a lottery with 10,000 tickets, each with an equal chance of winning, but where only one ticket will win. Consider the proposition that ticket 1437 will lose. It is incredibly likely that this proposition is true, and the same holds, for every ticket number n, of the proposition that ticket n will lose. Nevertheless, a number of epistemologists maintain that one is not justified in believing such lottery propositions (for instance, BonJour (1980), Pollock (1995), Evnine (1999), Nelkin (2000), Adler (2005), Douven (2006), Kvanvig (2009), Nagel (2011), Littlejohn (2012), Smithies (2012), McKinnon (2013), and Locke (2014)).

Some consequentialist approaches to justification, however, look as though they will say that one is justified in believing such lottery propositions. For instance, suppose that there is a process of belief formation that issues beliefs of the form ticket n is a loser. This process is highly reliable and so beliefs produced by it are justified according to one version of reliabilism about justification. Some process reliabilists about justification might maintain that there is no such process in an attempt to avoid this implication of their view. However, as Selim Berker (2013b) has noted, the very structure of consequentialist views in epistemology suggests that some case can be brought against the consequentialist in which a set of beliefs is justified purely in virtue of statistical information about the relative lack of falsehoods in a set of propositions.
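The arithmetic behind the claim of high reliability is simple, and a short sketch makes it explicit. The function below is a hypothetical illustration, not a model of any particular reliabilist theory. It treats the process as one that forms the belief that ticket n is a loser for every ticket in the 10,000-ticket lottery, and computes the proportion of those beliefs that are true.

```python
from fractions import Fraction


def lottery_truth_ratio(num_tickets, winning_ticket):
    """Proportion of true beliefs when one believes, of every ticket, that it loses."""
    # The belief about ticket n is true exactly when n is not the winner.
    beliefs = {n: (n != winning_ticket) for n in range(1, num_tickets + 1)}
    true_count = sum(beliefs.values())
    return Fraction(true_count, num_tickets)


# The winning ticket number is an arbitrary assumption; the ratio does not depend on it.
ratio = lottery_truth_ratio(10_000, winning_ticket=1437)
print(ratio, float(ratio))
```

Whichever ticket wins, exactly one of the 10,000 beliefs is false, so the truth ratio is 9,999 in 10,000. A view that ties justification to a high enough truth ratio will therefore have difficulty withholding justification from these beliefs.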

Again, not all maintain that there is no justification to be had in such cases; some maintain that while such lottery propositions cannot be known, they nevertheless can be justified. But a number of epistemologists do deny that lottery propositions can be justified, and so we again have a potential worry here for the consequentialist. For a response to this worry, see Ahlstrom-Vij and Dunn (2014).

6. References and Further Reading

Adler, J. (2005) ‘Reliabilist Justification (or Knowledge) as a Good Truth-Ratio’ Pacific Philosophical Quarterly 86: 445–458.
Ahlstrom-Vij, K. and Dunn, J. (2014) ‘A Defence of Epistemic Consequentialism’ Philosophical Quarterly 64: 541–551.
Angere, S. (2007) ‘The Defeasible Nature of Coherentist Justification’ Synthese 157: 321–335.
Berker, S. (2013a) ‘Epistemic Teleology and the Separateness of Propositions’ The Philosophical Review 122: 337–393.
Berker, S. (2013b) ‘The Rejection of Epistemic Consequentialism’ Philosophical Issues 23: 363–387.
Bishop, M. and Trout, J. D. (2005) Epistemology and the Psychology of Human Judgment. Oxford: Oxford University Press.
BonJour, L. (1980) ‘Externalist Theories of Empirical Knowledge’ Midwest Studies in Philosophy 5: 53–74.
BonJour, L. (1985) The Structure of Empirical Knowledge. Cambridge, MA: Harvard University Press.
Bovens, L., and Hartmann, S. (2003) Bayesian Epistemology. Oxford: Oxford University Press.
Broome, J. (1991) Weighing Goods: Equality, Uncertainty and Time. Oxford: Wiley-Blackwell.
Brown, C. (2011) ‘Consequentialize This’ Ethics 121: 749–771.
Caie, M. (2013) ‘Rational Probabilistic Incoherence’ Philosophical Review 122: 527–575.
Christensen, D. (2004) Putting Logic in Its Place. Oxford: Oxford University Press.
Christensen, D. (2007) ‘Epistemology of Disagreement: The Good News’ Philosophical Review 116: 187–217.
Conee, E. (1992) ‘The Truth Connection’ Philosophy and Phenomenological Research 52: 657–669.
Conee, E. and Feldman, R. (2008) ‘Evidence’ In Q. Smith (Ed.), Epistemology: New Essays. Oxford: Oxford University Press: 83–104.
DePaul, M. (2004) ‘Truth Consequentialism, Withholding and Proportioning Belief to the Evidence’ Philosophical Issues 14: 91–112.
Douglas, H. (2000) ‘Inductive Risk and Values in Science’ Philosophy of Science 67: 559–579.
Douglas, H. (2009) Science, Policy, and the Value-Free Ideal. Pittsburgh, PA: University of Pittsburgh Press.
Douven, I. (2006) ‘Assertion, Knowledge, and Rational Credibility’ Philosophical Review 115: 449–485.
Easwaran, K. and Fitelson, B. (2012) ‘An “Evidentialist” Worry about Joyce’s Argument for Probabilism’ Dialectica 66: 425–433.
Easwaran, K. and Fitelson, B. (2015) ‘Accuracy, Coherence, and Evidence’ In T. Szabo Gendler and J. Hawthorne (Eds.), Oxford Studies in Epistemology, Volume 5. Oxford: Oxford University Press.
Evnine, S. (1999) ‘Believing Conjunctions’ Synthese 118: 201–227.
Fallis, D. (2000) ‘Veritistic Social Epistemology and Information Science’ Social Epistemology 14: 305–316.
Fallis, D. (2006) ‘Epistemic Value Theory and Social Epistemology’ Episteme 2: 177–188.
Fallis, D. (2007) ‘Attitudes Toward Epistemic Risk and the Value of Experiments’ Studia Logica 86: 215–246.
Feldman, R. (1998) ‘Epistemic Obligations’ Philosophical Perspectives 2: 236–256.
Feldman, R. (2000) ‘The Ethics of Belief’ Philosophy and Phenomenological Research 60: 667–695.
Feldman, R. and Conee, E. (1985) ‘Evidentialism’ Philosophical Studies 48: 15–34.
Firth, R. (1981) ‘Epistemic Merit, Intrinsic and Instrumental’ Proceedings and Addresses of the American Philosophical Association 55: 5–23.
Foley, R. (1987) The Theory of Epistemic Rationality. Cambridge, MA: Harvard University Press.
Fumerton, R. (1995) Metaepistemology and Skepticism. Lanham, MD: Rowman & Littlefield.
Goldman, A. (1979) ‘What Is Justified Belief?’ In G. Pappas (Ed.), Justification and Knowledge. Springer: 1–23.
Goldman, A. (1986) Epistemology and Cognition. Cambridge, MA: Harvard University Press.
Goldman, A. (1999) Knowledge in a Social World. Oxford: Oxford University Press.
Good, I. J. (1967) ‘On the Principle of Total Evidence’ British Journal for the Philosophy of Science 17: 319–321.
Greaves, H. (2013) ‘Epistemic Decision Theory’ Mind 122: 915–952.
Greaves, H. and Wallace, D. (2006) ‘Justifying Conditionalization: Conditionalization Maximizes Expected Epistemic Utility’ Mind 115: 607–632.
Haddock, A., Millar, A., and Pritchard, D. (Eds.) (2009) Epistemic Value. Oxford: Oxford University Press.
Harman, G. (1988) Change in View. Cambridge, MA: MIT Press.
Hempel, C. (1960) ‘Inductive Inconsistencies’ Synthese 12: 439–469.
Huemer, M. (2011) ‘Does Probability Theory Refute Coherentism?’ Journal of Philosophy 108: 35–54.
Jenkins, C. S. (2007) ‘Entitlement and Rationality’ Synthese 157: 25–45.
Joyce, J. (1998) ‘A Nonpragmatic Vindication of Probabilism’ Philosophy of Science 65: 575–603.
Joyce, J. (2009) ‘Accuracy and Coherence: Prospects for an Alethic Epistemology of Partial Belief’ In Huber and Schmidt-Petri (Eds.) Degrees of Belief. Springer: 263–300.
Kagan, S. (1997) Normative Ethics. Boulder, CO: Westview Press.
Kitcher, P. (1990) ‘The Division of Cognitive Labor’ The Journal of Philosophy 87: 5–22.
Klausen, S. H. (2009) ‘Two Notions of Epistemic Normativity’ Theoria 75: 161–178.
Klein, P. and Warfield, T. A. (1994) ‘What Price Coherence?’ Analysis 54: 129–132.
Kvanvig, J. (2003) The Value of Knowledge and the Pursuit of Understanding. Cambridge: Cambridge University Press.
Kvanvig, J. (2009) ‘Assertion, Knowledge and Lotteries’ In Greenough and Pritchard (Eds.), Williamson on Knowledge. Oxford: Oxford University Press: 140–160.
Laudan, L. (1984) Science and Values. Berkeley: University of California Press.
Laudan, L. (2006) Truth, Error, and Criminal Law. Cambridge: Cambridge University Press.
Leitgeb, H. and Pettigrew, R. (2010a) ‘An Objective Justification of Bayesianism I: Measuring Inaccuracy’ Philosophy of Science 77: 201–235.
Leitgeb, H. and Pettigrew, R. (2010b) ‘An Objective Justification of Bayesianism II: The Consequences of Minimizing Inaccuracy’ Philosophy of Science 77: 236–272.
Levi, I. (1967) Gambling with Truth. Cambridge, MA: MIT Press.
Littlejohn, C. (2012) Justification and the Truth Connection. Cambridge: Cambridge University Press.
Locke, D. T. (2014) ‘The Decision-Theoretic Lockean Thesis’ Inquiry 57: 28–54.
Maher, P. (1990) ‘Why Scientists Gather Evidence’ British Journal for the Philosophy of Science 41: 103–119.
Maher, P. (1992) ‘Diachronic Rationality’ Philosophy of Science 59: 120–141.
Maher, P. (1993) Betting on Theories. Cambridge: Cambridge University Press.
Maitzen, S. (1995) ‘Our Errant Epistemic Aim’ Philosophy and Phenomenological Research 55: 869–876.
Mayo-Wilson, C., Zollman, K. J., and Danks, D. (2011) ‘The Independence Thesis: When Individual and Social Epistemology Diverge’ Philosophy of Science 78: 653–677.
McKinnon, R. (2013) ‘Lotteries, Knowledge, and Irrelevant Alternatives’ Dialogue 52: 523–549.
McNaughton, D. and Rawling, P. (1991) ‘Agent-Relativity and the Doing-Happening Distinction’ Philosophical Studies 63: 163–185.
Muldoon, R. (2013) ‘Diversity and the Division of Cognitive Labor’ Philosophy Compass 8: 117–125.
Muldoon, R. and Weisberg, M. (2009) ‘Epistemic Landscapes and the Division of Cognitive Labor’ Philosophy of Science 76: 225–252.
Myrvold, W. (2012) ‘Epistemic Values and the Value of Learning’ Synthese 187: 547–568.
Nagel, J. (2011) ‘The Psychological Basis of the Harman-Vogel Paradox’ Philosophers’ Imprint 11: 1–28.
Nagel, T. (1986) The View from Nowhere. Oxford: Oxford University Press.
Nelkin, D. K. (2000) ‘The Lottery Paradox, Knowledge, and Rationality’ Philosophical Review 109: 373–409.
Nelson, M. (2010) ‘We Have No Positive Epistemic Duties’ Mind 119: 83–102.
Nozick, R. (1974) Anarchy, State, and Utopia. New York: Basic Books.
Oddie, G. (1997) ‘Conditionalization, Cogency, and Cognitive Value’ British Journal for the Philosophy of Science 48: 533–541.
Olsson, E. J. (2005) Against Coherence: Truth, Probability, and Justification. Oxford: Oxford University Press.
Percival, P. (2002) ‘Epistemic Consequentialism’ Proceedings of the Aristotelian Society Supplementary Volume 76: 121–151.
Pettigrew, R. (2012) ‘Accuracy, Chance, and the Principal Principle’ Philosophical Review 121: 241–275.
Pettigrew, R. (2013a) ‘A New Epistemic Utility Argument for the Principal Principle’ Episteme 10: 19–35.
Pettigrew, R. (2013b) ‘Accuracy and Evidence’ Dialectica 67: 579–596.
Pettigrew, R. (2013c) ‘Epistemic Utility and Norms for Credences’ Philosophy Compass 8: 897–908.
Pettigrew, R. (2015) ‘Accuracy and the Belief-Credence Connection’ Philosophers’ Imprint 15: 1–20.
Pettit, P. (1988) ‘The Consequentialist Can Recognise Rights’ The Philosophical Quarterly 38: 42–55.
Pettit, P. (2000) ‘Non-consequentialism and Universalizability’ The Philosophical Quarterly 50: 175–190.
Pollock, J. (1995) Cognitive Carpentry. Cambridge, MA: MIT Press.
Portmore, D. (2007) ‘Consequentializing Moral Theories’ Pacific Philosophical Quarterly 88: 39–73.
Pritchard, D., Millar, A., and Haddock, A. (2010) The Nature and Value of Knowledge: Three Investigations. Oxford: Oxford University Press.
Sen, A. (1982) ‘Rights and Agency’ Philosophy & Public Affairs 11: 3–39.
Smart, J. J. C. and Williams, B. (1973) Utilitarianism: For and Against. Cambridge: Cambridge University Press.
Smith, M. (2009) ‘Two Kinds of Consequentialism’ Philosophical Issues 19: 257–272.
Smithies, D. (2012) ‘The Normative Role of Knowledge’ Noûs 46: 265–288.
Solomon, M. (1992) ‘Scientific Rationality and Human Reasoning’ Philosophy of Science 59: 439–455.
Stalnaker, R. (2002) ‘Epistemic Consequentialism’ Proceedings of the Aristotelian Society Supplementary Volume 76: 152–168.
Stapleford, S. (2013) ‘Imperfect Epistemic Duties and the Justificational Fecundity of Evidence’ Synthese 190: 4065–4075.
Stich, S. (1990) The Fragmentation of Reason. Cambridge, MA: MIT Press.
Thomson, J. J. (1985) ‘The Trolley Problem’ The Yale Law Journal 94: 1395–1415.
van Fraassen, B. (1984) ‘Belief and the Will’ The Journal of Philosophy 81: 235–256.
Whitcomb, D. (2007) An Epistemic Value Theory. (Doctoral dissertation) Retrieved from Rutgers University Community Repository at: http://dx.doi.org/doi:10.7282/T3ZP46HD
Williams, J. R. G. (2012) ‘Gradational Accuracy and Nonclassical Semantics’ The Review of Symbolic Logic 5: 513–537.
Zagzebski, L. (2003) ‘Intellectual Motivation and the Good of Truth’ In Zagzebski, L. and DePaul, M. (Eds.) Intellectual Virtue: Perspectives from Ethics and Epistemology. Oxford University Press: 135–154.
Zollman, K. J. (2007) ‘The Communication Structure of Epistemic Communities’ Philosophy of Science 74: 574–587.
Zollman, K. J. (2010) ‘The Epistemic Benefit of Transient Diversity’ Erkenntnis 72: 17–35.

Author Information

Jeffrey Dunn
Email: jeffreydunn@depauw.edu
DePauw University
U. S. A.

Benedict De Spinoza: Moral Philosophy

Like many European philosophers in the early modern period, Benedict de Spinoza (1632-1677) developed a moral philosophy that fused the insights of ancient theories of virtue with a modern conception of humans, their place in nature, and their relationship to God. Unlike many other authors in this period, however, Spinoza was strongly opposed to anthropocentrism and had no commitment whatsoever to traditional theological views. His unique metaphysics motivated an intriguing moral philosophy. Spinoza was a moral anti-realist, in that he denied that anything is good or bad independently of human desires and beliefs. He also endorsed a version of ethical egoism, according to which everyone ought to seek their own advantage; and, just as it did for Thomas Hobbes, this in turn led him to develop a version of contractarianism. However, Spinoza’s versions of each of these views, and the way in which he reconciles them with one another, are influenced in fascinating ways by his very unorthodox metaphysical picture.

The topics mentioned so far can be related comfortably to twenty-first century debates in moral philosophy. Yet Spinoza was also very interested in another issue that is moral only in the more archaic sense that it pertains to the good life: namely, the means by which humans may (to some extent) achieve mastery over their passions. Though this topic was of central importance to Spinoza, the pride of place he awarded it in his Ethics reflects the fact that seventeenth-century conceptions of moral philosophy were, in subtle but important ways, different than our own.

1. Guiding Metaphysical Principles
2. Moral Philosophy in Spinoza’s System
3. Spinoza’s Remedies for the Passions
4. Conclusion
5. References and Further Reading
    a. Primary Sources
    b. Secondary Sources

1. Guiding Metaphysical Principles

The name of Spinoza’s most famous work is the Ethics, but he does not really broach the topic of ethics until part four of the five-part work. The reason for this is that although his aim is to set forth “the right way of living” (E4app, G II/266) and to explain “what freedom of mind, or blessedness, is” (E5pref, G II/277), his accounts of these things depend upon certain key metaphysical principles that he feels must be established first.

This article provides only brief explanations of the relevant principles. For more detailed discussions of each of them, see the main article on Spinoza.

a. Substance Monism

In Cartesian philosophy, a substance is something that does not depend for its existence on anything else—or, in the case of created substances, anything other than God (CSM I, 210). A mode is something that is not a substance (for instance, a property, quality, or attribute). Descartes appears to take the human body and mind to be paradigmatic substances, and the extended properties and thoughts of the body and mind (respectively) to be paradigmatic modes. Spinoza was critical of Descartes for giving a non-univocal definition of the term ‘substance,’ so that the predicate means something different when applied to God than when applied to a human. Spinoza’s alternative approach was to stick to the most general definition: a substance is something that is “in itself and is conceived through itself, that is, that whose concept does not require the concept of another thing, from which it must be formed” (E1d3).

In defining a substance this way, Spinoza avoids the equivocation involved in the Cartesian conception of substances. However, he also quickly concludes that given this definition, humans are not substances. Indeed, Spinoza argues, there can be only one substance, God (E1p14), and everything else is merely a mode of God (E1p15). As a result, Spinoza conceives of God as a being that is absolute and perfect by its very nature; humans, by contrast, are dependent and imperfect by their very nature.

b. Necessitarianism

Although ordinarily we speak as though things could have been different than they in fact are—you could have turned left rather than right, the election might have gone differently, and so on—Spinoza denies that these alternative scenarios are genuinely possible. He provides several different arguments for this conclusion, but perhaps the simplest is based upon the thought that, since the world is a mode of God, and God could not be different than it is, it follows that “Things could have been produced by God in no other way, and in no other order than they have been produced” (E1p33). This divine necessitarianism trickles down: humans, too, could not have acted otherwise than they did. The fact that we ordinarily believe ourselves capable of acting otherwise is an illusion produced by our ignorance of both the physical and psychological forces influencing us, as well as of our own nature (E3p2s).

c. The Conatus Doctrine

Perhaps the most important metaphysical principle involved in Spinoza’s ethical theory is his view that “Each thing, as far as it can by its own power, strives to persevere in its being” (E3p6). The interpretation of this principle is the source of much scholarly disagreement, but a few things are clear. The striving [conatus] at issue is not to be confused with conscious effort, since Spinoza takes the principle to govern bodies as well as minds. Nor is the conatus to be confused with the metabolic processes of a living organism, since Spinoza takes the principle to govern (what we ordinarily consider to be) non-living things as well as living ones. Spinoza is making the metaphysical claim that each thing is possessed of an inner force, by which it continuously reasserts its own existence.

This doctrine is particularly important for understanding Spinoza’s moral theory, since Spinoza accepts psychological egoism on the basis of it: “When this striving is related only to the mind, it is called will; but when it is related to the mind and body together, it is called appetite. This appetite…is the very essence of man, from whose nature there necessarily follow those things that promote his preservation” (E3p9s).

d. Activity and Passivity

In transitioning from his metaphysics to his moral theory, Spinoza relies heavily upon two concepts, activity and passivity, that come to take the place of traditional axiological concepts like good and evil. Something is active insofar as it produces various effects through its striving; conversely, it is passive insofar as it and its states are produced by external causes (E3d1–3). Both activity and passivity are treated as matters of degree. Thus God, the total cause of all things, is active in the highest degree and not at all passive, while humans (since they are not substances) are always partly active and partly passive, causally dependent upon God as well as upon other modes.

With respect to the human mind, activity takes the form of rational or adequate cognition (E3p1). Actions of the mind are adequate ideas, which increase its power of acting, while passions of the mind are inadequate, confused ideas, which decrease its power of acting. Spinoza’s conception of passions is quite general, so, for example, what we would call a “dispassionate” state of melancholy could for him qualify as a powerful passion because of how much it diminishes our activity. This should be borne in mind when we turn, in section 3, to considering Spinoza’s account of how to overcome our passions.

2. Moral Philosophy in Spinoza’s System

a. Spinoza’s Metaethics: Moral Anti-Realism

Spinoza’s metaphysical views quickly commit him to a version of moral anti-realism. A moral realist holds that at least some things are good or bad independently of what we desire or believe to be the case. Spinoza, in numerous passages in the Ethics and earlier works, denies that there are any such moral qualities. His rejection of moral realism is tied up with his rejection of teleological explanations of nature, for he sees the attribution of qualities like goodness or perfection as an error that is based upon the false belief that nature was designed by God with humanity in mind. Spinoza explains, “After men persuaded themselves that everything which happens, happens on their account, they had to judge that what is most important in each thing is what is most useful to them… Hence, they had to form these notions, by which they explained natural things: good, evil, order, confusion, warm, cold, beauty, ugliness” (E1app, G II/82). This family of concepts, which includes moral and aesthetic concepts along with concepts of sensible qualities, Spinoza holds to be produced by the imagination rather than reason. Hence the concepts “by which ordinary people are accustomed to explain Nature…do not indicate the nature of anything, only the constitution of the imagination” (E1app, G II/83).

In addition to providing etiological accounts intended to explain why people make the mistake of treating moral qualities as objective (and thereby to undermine the belief that they are objective), Spinoza develops two distinct arguments for his anti-realism. His first argument for anti-realism is that if moral qualities like evil or imperfection were objective, then it would be conceivable “that Nature sometimes fails or sins, and produces imperfect things” (E4pref, G II/207). But this is inconceivable: such a possibility supposes that there is a goal or standard that nature has fallen short of, yet there is no such goal or standard: “The reason why…God, or Nature, acts and the reason why it exists, are one and the same. As it exists for the sake of no end, it also acts for the sake of no end” (ibid). Again, just as in his earlier discussion, Spinoza’s denial of the objectivity of moral qualities is based upon his rejection of natural teleology. The rejection of natural teleology, in turn, is based upon his substance monism and necessitarianism: “all things follow from the necessity of the divine nature, and hence…whatever seems immoral, dreadful, unjust, and dishonorable, arises from the fact that [we conceive] the things themselves in a way which is disordered, mutilated, and confused” (E4p73s).

It is worth mentioning a second argument that comes shortly after, but appears to have very different motivations: “As far as good and evil are concerned, they also indicate nothing positive in things, considered in themselves… For one and the same thing can, at the same time, be good, and bad, and also indifferent. For example, music is good for one who is melancholy, bad for one who is mourning, and neither good nor bad to one who is deaf” (E4pref, G II/208). If moral qualities were objective, then nothing could have contrary moral qualities at one and the same time. But many things do have contrary moral qualities at one and the same time, with respect to different observers. Therefore, moral qualities are not objective, in the sense that they “indicate nothing positive in things, considered in themselves” (ibid). This argument is quite different from the previous one. The first argument draws out the a priori incoherence that would be involved in the very idea of objective moral qualities, while the second is based upon the empirical premise that different people may judge a thing to have contrary moral qualities. It is an ancestor of the argument from disagreement often used to defend moral relativism.

In spite of the fact that Spinoza rejects moral realism, he does not advocate for the elimination of moral language. To see why, consider an advantage that the moral realist seems to have over Spinoza’s anti-realism. The moral realist, as Spinoza sees it, holds that in cases of moral judgment, we first recognize something to be good (for example), and then this results in our forming a desire for that thing. Though Spinoza rejects this account of moral judgment, one of its benefits is that it allows us to distinguish between what is desired and what is genuinely desirable. Since it often happens that a person wants something and later discovers it really to be undesirable — or even wants something in spite of the fact that he knows it to be undesirable — the distinction is an important one to preserve. For example, we want to be able to make sense of the fact that although someone wants to commit suicide, this is not really desirable; the moral realist’s picture gives us a way to do this by distinguishing the (true) claim that this person desires to commit suicide from the (false) claim that it is good/desirable for this person to commit suicide.

Yet Spinoza thinks the moral realist’s story is exactly backwards: “we neither strive for, nor will, neither want, nor desire anything because we judge it to be good; on the contrary, we judge something to be good because we strive for it, will it, want it, and desire it” (E3p9s; cf. 3p39s). He thus subscribes to a desire-satisfaction theory of value: what is ultimately of value is the satisfaction of desire; things become valuable only by virtue of their being desired, or their serving to satisfy some desire. (For more on this, see Youpa [2010, 209, fn. 1], and LeBuffe [2010, 152–9].) So it may seem that Spinoza will have a problem making the distinction between what we think is good and what is genuinely good for us.

Spinoza agrees that we need this distinction, but holds that our judgments about what is genuinely good for us are based upon an “idea of man” we have formed “as a model of human nature” (E4pref, G II/208). To hold on to the distinction between what a person desires and what is genuinely desirable, then, Spinoza wants to preserve our ordinary talk of good and evil, with the caveat that such talk refers only to the relation between ourselves and an idealized model human (Curley [1979, 356–62], Nadler [2006, 215–9], and Hübner [2014, 136–140]). Hence, Spinoza writes, “I shall understand by good what we know certainly is a means by which we may approach nearer and nearer to the model of human nature we set before ourselves. By evil, what we certainly know prevents us from becoming like that model” (ibid). Since the model is an idealization, the judgment that something is good or evil does not involve any commitment to objective, mind-independent qualities of goodness or evilness. Yet having such a model is useful, since it allows us to make judgments about what will be good or bad for us as distinct from what we presently happen to desire.

b. Spinoza’s Ethics: Ethical Egoism, Contractarianism, and Virtue Theory

The previous section established that Spinoza is a moral anti-realist in the sense that he denies that there exist mind-independent moral properties. Nevertheless, on most readings of the Ethics, Spinoza is also an ethical egoist, since he holds that reason “demands that everyone love himself, seek his own advantage…and absolutely, that everyone should strive to preserve his own being as far as he can” (E4p18s; see also TTP Ch. 16, 175). These two views are compatible, however, since Spinoza’s approach to developing his positive moral theory is to reduce normative claims to considerations of self-interest in a manner reminiscent of Hobbes (Curley 1988, 119–124). Perhaps the major difference between the Spinozist and the Hobbist approaches to egoism is that Spinoza provides a metaphysical argument for the view, in contrast to Hobbes’ psychological argument. Specifically, Spinoza bases his ethical egoism upon his conatus doctrine.

Spinoza’s initial argument for the claim that reason demands that everyone seek his own advantage is brief: “Since reason demands nothing contrary to Nature, it demands that everyone…seek his own advantage… This, indeed, is as necessarily true as that the whole is greater than its part” (E4p18s). Breaking the argument down:

(1) Reason demands nothing contrary to Nature.
(2) It is contrary to Nature for someone not to seek his own advantage.
(3) So, reason demands that everyone seek his own advantage.

Both premises hinge upon what is meant by the claim that something is “contrary to Nature.” By this, Spinoza seems to mean something impossible, something that cannot be, by virtue of incompatibility with either the laws of logic or of nature. In this interpretation, premise (1) is Spinoza’s nod to the commonly held principle that ought implies can: you can be morally bound to do only something that you are able to do. More importantly, given this interpretation, the second premise comes out as a conceptual truth grounded in part of the conatus doctrine.

In E3p4, which he references in his argument for egoism, Spinoza argued, “No thing can be destroyed except through an external cause.” He takes this to entail that “Each thing, as far as it can by its own power, strives to persevere in its being” (E3p6). So, in Spinoza’s view, we have a purely metaphysical argument that it would be “contrary to Nature” for someone not to seek his own advantage. It would be contrary to Nature for anything not to seek its own advantage, insofar as it has the power to do so.

The second premise entails psychological egoism, for it entails that each person will seek his own advantage at all times. Spinoza’s argument for ethical egoism in this sense depends upon psychological egoism, and so it may seem reminiscent of Hobbes’ rationale for the similar conclusion that “of the voluntary acts of every man the object is some good to himself” (L I.xiv; p. 82). However, Hobbes reaches this view on the basis of his account of the psychology of voluntary acts: a voluntary act proceeds from the will, and a person’s will is just the last appetite that strikes him after a process of deliberation (L I.vi; p. 33). Since “whatsoever is the object of any man’s appetite…he for his part calleth good” (L I.vi; p. 28), Hobbes would agree with Spinoza that each person will seek what he considers to be his own advantage at all times. In spite of the similarity of their conclusions, Spinoza’s argument is grounded in the metaphysics of the conatus doctrine, while Hobbes’ argument is grounded in his psychological theory.

One of the philosophical problems with Spinoza’s version of ethical egoism has to do with whether, and to what extent, Spinoza’s view can really be a moral theory at all. Given the argument for the view, it is unclear how Spinoza can take the dictates of reason to be prescriptive. For example, according to Rutherford (2008), Spinoza treats the dictates of reason as adequate ideas that, when we possess them, cause us to act in ways that are conducive to our actual self-interest. If so, to follow the dictates of reason is just to be caused to behave in certain ways, which sits awkwardly alongside the thought that such dictates are prescriptive in any ordinary sense. This topic is the subject of ongoing scholarly inquiry—responses to the problem have been proposed by Kisner (2011, 118) and Steinberg (2014)—and it is closely related to the issue (flagged at the outset of this article) that Spinoza’s conception of ethics is in many ways quite different from our own.

i. The Greatest Good and the Inclination to Morality

For an egoist, the question as to what is good for an individual is crucial, for the answer to this question will determine what that individual ought, morally, to do. And Spinoza’s conception of the good is stereotypically egoistic: “By good I shall understand what we certainly know to be useful to us” (E4d1). Likewise, to be virtuous is simply to have and to exercise the power to do what is in our nature, and (as per the conatus doctrine) what is in your nature is to seek your own advantage as far as you are able (E4d8; 4p20). As a result, strength of character is also accounted for in self-interested terms.

Many passages in the Ethics make it appear that Spinoza simply thinks that what is best for each of us is the continuation of our lives. For example, he writes that “No one can desire to be blessed, to act well and to live well, unless at the same time he desires to be, to act, and to live, that is, to actually exist” (E4p21). Hence, the principle of seeking one’s own advantage and preserving one’s being is “the first and only foundation of virtue” (E4p22c), and obeying this principle is the only pursuit that is good for its own sake (E4p25). If this were so, then we might expect Spinozist morality to license all manner of violations of traditional morality in the name of self-preservation and the advancement of our own interests. Surprisingly, although he takes self-interest and self-preservation as the foundations of morality, Spinoza nevertheless holds that “The good which everyone who seeks virtue wants for himself, he also desires for other men” (E4p37). Although virtue is founded in rational self-interest, rational self-interest in turn urges us to desire the good of others.

To see why Spinoza thinks this, we need to understand this “good” that is desired by “everyone who seeks virtue.” The good in question, which is supposed to trump all other goods, is not actually our own lives, but what those lives are best spent in obtaining—the knowledge of God. Spinoza writes, “Knowledge of God is the mind’s greatest good; its greatest virtue is to know God” (E4p28). The argument for this is characteristically metaphysical, and again based upon the conatus doctrine. Spinoza argues that the “striving of the mind…is nothing but understanding,” and “cannot conceive anything to be good for itself except what leads to understanding” (E4p26d). Our innate desire to understand nature is, in his view, the very essence of our minds, and so this drive to understand also characterizes the good for us. Finally, “The greatest thing the mind can understand is God” (E4p28d), since ‘God’ signifies the whole of nature, so it follows that “the mind’s greatest advantage…is knowledge of God” (ibid).

Therefore, in Spinoza’s view, our greatest good is not the sort of thing that is subject to natural scarcity, nor need it be the object of competition. Rather, it is “common to all, and can be enjoyed by all equally” (E4p36). And because, in Spinoza’s view, other humans are more useful to us to the extent that they are rational (E4p35c1), it is entirely to our benefit when others pursue the same good—understanding—that we ourselves seek; for detailed exposition of Spinoza’s argument that it is to our benefit to pursue the good of others, see Della Rocca (2004, 125–8), Kisner (2009), and Grey (2013). This is why Spinoza thinks humans have a rational impetus to act in moral (that is, benevolent) ways toward others from a starting point of pure self-interest: “The desire to do good generated in us by our living according to the guidance of reason, I call morality” (E4p37s1).

ii. Spinoza’s Contractarianism

So far, Spinoza’s moral theory might not appear to be capable of answering the practical questions it is ordinarily hoped such a theory will answer. The conception of the good just outlined is so strikingly focused on human intellectual life that the resulting moral theory may seem far removed from ordinary moral matters. However, Spinoza has a bit more to say about morality beyond his claim that it is constituted by the pursuit of knowledge of God and the desire to do good for others. One important strand of Spinoza’s moral thought is a version of moral contractarianism, the view that we may become normatively bound to behave in certain ways on the basis of agreements or contracts we make when we live in society with others. His version of contractarianism is heavily influenced by Hobbes, from whom Spinoza appears to have drawn a number of key ideas. (This article deals only briefly with those aspects of Spinoza’s contractarianism that bear upon morality; see the article on Spinoza’s Political Philosophy for more information about this topic.)

It might seem surprising that Spinoza thinks humans need to live in society at all. Given that our greatest good is knowledge of God, ought we not all retreat to the mountaintop and spend our time in metaphysical inquiry? Spinoza’s reason for denying this is his pessimistic view of the prospects for humans overcoming all of their passions. Even the wisest philosopher requires assistance from her community in the pursuit of her greatest good. On this point, Spinoza disagrees with Descartes, who holds that “Even those who have the weakest souls could acquire absolute mastery over all their passions” (CSM I, 348). Spinoza’s view, by contrast, is that on account of the force of their passions, people “are often drawn in different directions and are contrary to one another, while they require one another’s aid” (E4p37s2, citations elided), and that these passions can never completely be overcome. Thus even the most wise and temperate among us has reason to enter a social contract. Because of our need for one another’s aid—whether to study philosophy or gain security—we have reason to live together with others in society. And because it is extremely difficult to moderate and restrain people’s worst passions, we cannot enjoy the benefits of civil society without entering a social contract.

With this observation in the background, the argument for moral contractarianism appears in a very abbreviated form in a scholium in the Ethics:

In order, therefore, that men may be able to live harmoniously and be of assistance to one another, it is necessary for them to give up their natural right and to make one another confident that they will do nothing which could harm others… By this law, therefore, society can be maintained, provided it appropriates to itself the right everyone has of avenging himself, and of judging good and evil. (E4p37s2, G II/237–8)

The argument is one commonly associated with classical social contract theories. Because humans are unable to live peacefully with one another so long as they retain their natural right to act as they please, it is in each person’s best interest to give up that right to the state, on the condition that everyone else does the same.

For this reason, Spinoza holds the prima facie surprising view that laws are morally binding on us even in cases in which those laws are not rational. In conflicts between the laws of our society and the dictates of our reason, the laws win out. Likewise, although in the context of his metaphysics Spinoza treats evil and sin as functions of an individual’s power, when he is writing about such things in the context of civil society he provides a very different picture. For example, he writes, “[E]veryone is bound to submit to the state. Sin, therefore, is nothing but disobedience…” (E4p37s2, G II/238); “A wrong occurs when a citizen or subject is forced to suffer some injury at the hands of another…contrary to the edict of the sovereign power” (TTP Ch. 16, 179). Why does law figure so prominently in discussions of morality in the context of civil society? In his Theological-Political Treatise, where he develops these ideas at length, Spinoza argues, “it is our duty [tenemur] to carry out all the orders of the sovereign power without exception, even if those orders are quite irrational. For reason bids us carry out even such orders so as to choose the lesser of two evils” (TTP Ch. 16, 177). The argument is that even if we recognize what is required by law to be irrational, it cannot be as irrational as it would be to violate the law, and thereby to become “enemies of the state and to act against reason which urges us to uphold the state with all our might” (ibid).

iii. Spinoza’s Virtue Theory and the “Free Man”

Another way in which Spinoza attempts to make his moral theory easier to put into practice is by providing a virtue theory based on it. Spinoza spends the latter sections of Part 4 of the Ethics developing a virtue theory of a fairly traditional sort, outlining which character traits and behaviors are virtues, and which are vices, in the conception of morality he has developed. He concludes this part of the work with some claims “concerning the free man’s temperament and manner of living,” where the “free man” is understood to be someone who lives wholly according to the guidance of reason. Since the very idea of a human being who lives wholly according to the guidance of reason is apparently contradictory (Spinoza has earlier observed that “man is necessarily always subject to passions” [E4p4c]), the discussion of the free man is not properly understood as describing an attainable goal. However, many scholars (such as Garrett [1990, 229–30] and Nadler [2006, 219]) take this discussion of the free man to be Spinoza’s presentation of the model of human nature he promised in the preface to Ethics 4. If so, then the description of the free man may best be seen as a guiding ideal, a character that ordinary people should aspire to be like, at least insofar as they are able.

Spinoza’s description of the free man’s way of living is based upon his account of virtues: if a character trait is grounded in our reason and our pursuit of understanding, it is a virtue; if it is grounded in our passions or ignorance, it is a vice. These considerations are clearly rooted in his conception of our greatest good (as outlined above). Although Spinoza’s treatment of many of the virtues is in keeping with traditional conceptions of virtue, he often parts ways with these traditional conceptions. For example, his conclusion that tenacity and nobility are virtues is in keeping with tradition. (Why are they virtues? Tenacity, he says, is the character trait corresponding to our rational striving for self-preservation, and nobility is the character trait corresponding to our rational striving for the benefit of others [E3p59s]. So both character traits are grounded in reason, not the passions.) However, Spinoza also argues that humility, repentance, and pity—character traits highly esteemed by traditional religious authorities—are not virtues, for they are “useless” and “do not arise from reason” (E4p50, 53, and 54). In his view, these character traits are not really virtues even if they do occasionally cause us to pursue the good, for they are only accidentally connected to the pursuit of the good. Reason, by contrast, is essentially connected to the pursuit of the good. As a result, anything good that we might be led to do out of pity (for instance), we could just as well have been led to do by reason. Being guided by pity, then, can be no better than being guided by reason. Moreover, pity always involves sadness, a form of disempowerment, so considered in itself, it is evil. Hence being guided by pity is inevitably worse than being guided by reason: “a man who lives according to the dictate of reason strives, as far as he can, not to be touched by pity” (E4p50c).

When Spinoza characterizes the “free man,” someone who lives wholly according to the guidance of reason, we should therefore expect only partial continuity with traditional conceptions of morality and virtuous living. The free man, Spinoza reasons, will pick his battles wisely, showing his virtue both in avoiding danger and in overcoming it (E4p69). He will always act honestly (E4p72). And he will seek to live in society with others rather than in solitude (E4p73). Nevertheless, the free man will graciously decline favors or gifts from those who do not follow the guidance of reason and who are ruled by their emotions (E4p70). Accepting such favors or gifts is liable to be dangerous, for the irrational gift-giver will inevitably value them more highly than the free man; the free man reserves his gratitude for the friendship of other rational people (E4p71), insofar as such friendship aids him in his pursuit of greater understanding. In practice no actual human could live exactly as the free man does, for (as mentioned in section 1 above) only a substance can be fully rational and active, and humans are not substances. Nevertheless, Spinoza’s presentation of these claims suggests that he takes them to be desirable ways of living, because they derive from “strength of character, that is, [from] tenacity and nobility” (E4p73), the primary virtues.

c. Applications of Spinoza’s Moral Theory

In the course of developing his moral theory, Spinoza sometimes applies it in passing to what he recognizes are traditional moral problems. He is often somewhat dismissive of many of these traditional moral problems, and his treatment of them rarely includes the sort of depth they receive in works of applied moral philosophy. However, his responses to such problems are often interesting because, given the demands of other parts of his philosophical system, his proposals are often surprising and idiosyncratic. This article discusses four of them: the moral permissibility of suicide, of lying, and of causing harm to animals or to the environment.

i. Suicide and Self-Harm

One traditional moral problem regards the moral permissibility of self-harm, the ultimate case of which is suicide. Spinoza does not agree with most of the traditional religious reasons for treating suicide as a sin. For example, an explanation of the wrongness of suicide common in the Judeo-Christian religious traditions appeals to one of the Ten Commandments: “Thou Shalt Not Kill.” According to this family of explanations, suicide is a sin because it involves taking a human life, which God has commanded humans not to do. Spinoza takes the conception of God upon which this explanation relies to be false: many imagine “God as a ruler, lawgiver, king, merciful, just and so forth; whereas these are all merely attributes of human nature, and not at all applicable to the divine nature” (TTP Ch. 5, 53). God simply does not issue commandments in the way that a king issues commandments. Given this fact, Spinoza thinks, it makes little sense to try to explain moral claims like “Suicide is a sin” by appeal to such commandments.

Although he disagrees with traditional reasons for taking suicide to be immoral, he nevertheless agrees that suicide is in fact immoral. On this point, Spinoza is very clear: those who commit suicide are “weak-minded and completely conquered by external causes contrary to their nature” (E4p18s). This conclusion is primarily a result of the conatus doctrine, since that doctrine forces Spinoza to deny that anyone can kill himself, strictly speaking. There must always be external causes that can be assigned to explain suicide or self-harm. But that is merely a descriptive claim; the evaluative claim that it is a “weak-minded” act derives from Spinoza’s ethical egoism. To be virtuous is to strive to preserve one’s being, so suicide is as far from virtue as one can go, in Spinoza’s view.

ii. Lying and Deceit

In his characterization of the “free man” at the end of Part 4 of the Ethics, Spinoza argues that a perfectly rational being “always acts honestly, not deceptively” (E4p72). The argument for this, on the face of it, anticipates Kant’s famous argument for the same conclusion. Spinoza reasons that if a perfectly rational being acted deceptively, he would do so “from the dictate of reason” (because, presumably, that is how a perfectly rational being does anything); but then it would be rational to act in that way, and “men would be better advised to agree only in words, and be contrary to one another in fact” (E4p72d). Spinoza takes this consequence to be absurd, for it is in our interest to bring others into as much agreement with our natures as possible (E4p31c), which living deceitfully would prevent.

One puzzle that this argument raises is the apparent conflict between Spinoza’s claim that a perfectly rational being would always act honestly and his claim that such a being would never do anything that brought about its own destruction. The two claims pull apart in any case where deception is the only means of preserving one’s life. Spinoza does not explicitly attempt to resolve this problem in the Ethics, though commentators have attempted to do so on his behalf in a variety of ways (Garrett 1990, 228–33).

iii. Animal Ethics

As should not be surprising given his ethical egoism, Spinoza is not sympathetic to the thought that we ought to worry ourselves about either our treatment of animals or of the environment. With respect to animals, Spinoza writes, “the law against killing animals is based more on empty superstition and unmanly compassion than sound reason” (E4p37s1). Reason dictates that we seek out the companionship of other humans because they share our nature, and what is good for us is good for them. However, since non-human animals differ in nature from us, reason dictates that we “consider our own advantage, use them at our pleasure, and treat them as is most convenient for us” (ibid). So, in spite of the fact that Spinoza does not view humans as metaphysically privileged—for instance, he disagrees with the Cartesian view that humans, but not other animals, have minds (ibid)—he nevertheless holds that we need not concern ourselves with the welfare of non-human animals. There may be situations in which our own welfare depends upon the welfare of a non-human animal, as when a farmer’s livelihood depends upon the welfare of his stock. But only in such situations will a human have reason to care about the welfare of a non-human. That said, it is not clear that this is the view he ought to have adopted, given his first principles (Grey 2013, 378–382).

iv. Environmental Ethics

With respect to the environment, matters are less clear-cut. Spinoza does acknowledge that humans are by their nature dependent upon their environment:

It is the part of a wise man, I say, to refresh and restore himself in moderation with pleasant food and drink, with scents, with the beauty of green plants, with decoration, music, sports, the theater, and other things of this kind, which anyone can use without injury to another. For the human Body is composed of a great many parts of different natures, which constantly require new and varied nourishment… (E4p45s)

Unfortunately, after this picturesque passage, Spinoza does not go on to consider what our dependence upon our environment might entail with regard to our treatment of it. Much of our concern regarding environmental ethics today is based on our recognition that the environment is not an inexhaustible source of nourishment and wealth; to a seventeenth-century author, this possibility would have seemed bizarre.

That being said, Spinoza’s views about animal ethics can be applied more or less directly to the environment as well. It would be irrational to work to preserve the environment for its own sake, since what is good for the environment is not necessarily good for us. However, insofar as we are concerned for the well-being of ourselves and other humans, and we recognize that well-being to depend upon the environment, it will be rational for us to preserve the environment—not for its sake, but for ours. This thought is at least hinted at in the quoted passage, where Spinoza notes that we are to “refresh and restore” ourselves only using means that “anyone can use without injury to another.” Insofar as the production of our “pleasant food and drink” turns out to cause injury to the environment upon which our neighbors (or we ourselves) depend, the practice would be open to moral criticism.

Some, such as Naess (1977), have gone further than this, arguing that Spinoza’s system provides a hospitable metaphysical background for ecology. However, as Kober (2013, 58–9) notes, one of the consequences of Spinoza’s views is that important conceptual tools of ecology lose their purchase. For example, Spinoza allows no distinction between what is natural and what is artificial. And, more importantly, there is no sense to be made of the designation of certain types of human activities as exploitative of the environment or of animals.

3. Spinoza’s Remedies for the Passions

In the 17th century, moral philosophy was not yet primarily preoccupied with either accounting for the nature and origins of morality or with establishing general principles governing moral obligation—though, as we have seen, Spinoza does develop some views on these topics en route to the final part of the Ethics. Rather, in this period, one of the central aims of moral philosophy was to provide the reader with psychological tools that could be used to cultivate desirable states of being. For this reason, seventeenth-century texts on moral philosophy tend to be more akin to self-help books than to twenty-first-century moral philosophy. The first half of Ethics V exemplifies this tendency. There, Spinoza attempts to provide a guide to how to train our minds in order to “bring it about that we are not easily affected with evil affects” (E5p10s).

‘Passion’ [passio] is a technical term for which Spinoza provides a careful definition. He writes, “An affect which is called a passion of the mind is a confused idea, by which the mind affirms of its body, or of some part of it, a greater or lesser force of existing than before, which, when it is given, determines the mind to think of this rather than that” (EIII Gen. Def. of Aff., G II/203–4). This definition connects the passions to his theory of ideas, since all passions are confused ideas. It also connects the passions to the conatus doctrine: the passions represent changes in the body’s “force of existing” [existendi vim], and this force of existing is presumably the same force introduced in his discussion of the innate striving of all things to persevere in existing (see section 1 above).

Spinoza appeals to both of these pieces of theoretical machinery, along with a few interesting additions, when he presents his five remedies for overcoming or restraining the passions. It is worth noting that although the view that we should strive to diminish the strength of our emotions has a very Stoic ring to it, he expressly distances himself from the Stoics. His reason for this is their belief “that the emotions depended absolutely on our will, and that we could absolutely govern them” (E5pref), which Spinoza thinks involves a misunderstanding of the structure and powers of the human mind. This comes out in his remedies for the passions: of the five remedies, only two (the first and fifth) are plausibly activities that we can perform intentionally.

a. Via Knowledge of the Affects

Spinoza claims that whenever we “form a clear and distinct idea” of a passion, it will no longer be a passion (E5p3). Since all passions are confused ideas—indeed, this is a core component of the definition of a passion—the most straightforward way to eliminate a passion is to eliminate the confusion that is the basis for that passion. In Spinoza’s view, the idea of an idea is not really distinct from the idea itself (E2p21s), so the clear and distinct idea we form of a passionate affect is not really distinct from that affect. But, since the clear and distinct idea is not confused, to conceive of it in this way is to eliminate the confusion from the original passion. Once we have eliminated this confusion, “the affect will cease to be a passion” (ibid). This approach to overcoming a passion does not eliminate the affect that constitutes the passion, but merely eliminates that feature of the affect in virtue of which it constituted a passion. The confusion a passionate affect involves is not intrinsic to that affect, in Spinoza’s view, and when that confusion is stripped away, the affect nevertheless remains.

Spinoza does not say much to clarify how this procedure is supposed to work. However, in at least one of Spinoza’s accounts of confusion, to say that an idea is confused is to say that it is partly determined by external causes (E2p29s). Thus, to strip away the confusion from a passion would require one somehow to strip away some of its causes. But that possibility appears to be inconsistent with Spinoza’s conception of causation, according to which an effect must be understood through its causes (Lin [2009, 270]; Bennett [1984, 336]). Scholars remain divided as to whether this difficulty, commonly referred to as the Changing Problem, is surmountable; see Marshall (2012) for some proposed solutions on Spinoza’s behalf.

b. Via Removing the Idea of an External Cause

All inadequate ideas have external causes (E3p1), so all passions are guaranteed to have external causes as well. In some cases, a passion not only has an external cause, but is such that it represents that cause (or purported cause). For example, love is joy accompanied by the idea of an external cause of that joy (E3 Def. of Affs. VI, G II/192). That is, the passion of love is a composite idea, and its parts are (i) joy, and (ii) the representation of something external as producing that joy. In such cases, we can destroy the passion by mentally separating the idea of the external cause that it includes. As Spinoza puts it: “For what constitutes the form of [such passions] is joy, or sadness, accompanied by an external cause… So if this is taken away, the form of love or hate is taken away at the same time. Hence, these affects, and those arising from them, are destroyed” (E5p2d).

c. Via the Endurance of Rational Affects

Spinoza’s third remedy for overcoming the passions is less a method than an observation about a natural consequence of our emotional psychology. One factor that determines the force with which an emotion strikes us is whether we conceive of its cause as present. For instance, Spinoza writes, “An affect whose cause we imagine to be with us in the present is stronger than if we did not imagine it to be with us” (E4p9). Examples of this phenomenon are abundant. Whether snakes are present or absent, Yetta fears them. However, if she thinks snakes are present, that fact serves to fuel her fear; and if she thinks them absent, her fear is greatly diminished. Affects that are produced by ordinary external objects—fear of snakes, love for one’s car, desire for pie, and so forth—all naturally vacillate in force over time based on whether we take their objects to be present or absent.

By contrast, affects “arising from or aroused by reason” (E5p7) have a very different profile. The object of such an affect is “necessarily related to the common properties of things” (E5p7d), which are pervasive features of reality, such as the property of being extended. In Spinoza’s view, “we always regard [such properties] as present,” and we “always imagine [them] in the same way” (ibid). So, such an affect will endure over a longer period of time, and with a more constant degree of force, than affects produced by external things. In the long run, Spinoza thinks, irrational affects will be forced to “accommodate themselves” more and more frequently to the rational affects. In this way, we will naturally tend over time toward rational affects and away from irrational ones. Spinoza’s line of argument here is thus aimed at defending the consoling thought that reason will tend to win out, rather than at providing a technique we can apply to help reason win out.

d. Via the Multiplicity of the Causes of Rational Affects

Recall from section 2 that Spinoza takes the greatest good for all humans to be knowledge of God. Fortunately, the idea of God is one that we “really fully possess” (E5p20s, G II/294; cf. E2p45), and so our greatest good can be realized. Indeed, since everything in nature is a mode of God, in Spinoza’s view, the skilled philosopher can revive and meditate upon the idea of God on the basis of any experience whatsoever; every experience can occasion a train of thought that leads the mind back to its greatest good, and the joy that it brings. But these facts suggest a fourth way in which we may diminish the force of our passions, namely by means of “the multiplicity of causes by which affections related to common properties or to God are encouraged” (E5p20s, G II/293).

As with the third method, Spinoza here has in mind the comparative force of rational affections over irrational ones. While the third remedy appeals to Spinoza’s view that the objects of rational affections are constant and unchanging, the fourth remedy appeals to his view that the causes of rational affections are universal and omnipresent. This is relevant because Spinoza holds that

[A]s an image, or affect, is related to more things, there are more causes by which it can be aroused and encouraged, all of which the mind…considers together as a result of the affect itself. And so the affect is the more frequent, or flourishes more often, and engages the mind more. (E5p11d)

This is another way in which rational affects gradually become stronger and eventually may overpower the passionate affects. Passionate affects may be very strong for as long as their cause is present, but rational affects—in particular, the desire for knowledge and the love of God—have innumerably more and greater causes, and so rational affects will “flourish more often, and engage the mind more” than passionate ones (ibid).

e. Via the Re-Ordering of the Affects

The final remedy Spinoza offers is unlike the previous two in that it is an activity that we can intentionally perform to diminish the force of our passions. It is based upon the power that he believes the human mind has to intentionally join two ideas to one another by frequently thinking about them in unison, so that when the first idea occurs, the second idea is naturally aroused in the mind as well. One of the ways in which we may apply this power is by intentionally joining passionate affects together with mottos or rules, “sure maxims of life,” that are rational to follow whenever those passions take hold of us (E5p10s, G II/287).

Spinoza uses several examples to flesh out how this remedy is supposed to work; the main example he uses is the maxim “that hate is to be conquered by love, or nobility, not by repaying it with hate in return” (ibid). He writes,

[W]e ought to think about and meditate frequently on the common wrongs of men, and how they may be warded off best by nobility. For if we join the image of a wrong to the imagination of this maxim, it will always be ready for us…when wrong is done to us. (E5p10s, G II/288)

We originally determine that nobility is a virtue by means of rational inquiry. However, we are not best served by attempting to recreate the chain of reasoning that would lead us to act nobly when someone insults or harms us, but rather by having that maxim firmly committed to memory. Spinoza is admitting that in the heat of the moment, we are unlikely to be able to simply reason our way out of passion. But by means of carefully arranging the thoughts our passions are associated with in advance, we can ensure that “the wrong, or hate usually arising from [another’s wronging us], will occupy a very small part of the imagination, and will be easily overcome” (ibid).

In this way, a person may intentionally use irrational processes (memory and imagination) to safeguard his ability to act rationally: “he who will observe these [rules] carefully…and practice them, will soon be able to direct most of his actions according to the command of reason” (E5p10s, G II/289). By training ourselves to react in ways that, in our calmer, dispassionate moments, we recognize to be rational, we will be prepared to respond appropriately even when we lack time for reflection. This appears to connect to Spinoza’s claim in the preface to Ethics 4 that we ought to cultivate and hold before ourselves an idealized human being whom we can model our own behavior upon (discussed in section 2.3). Based on passages such as this, scholarship on Spinoza’s ethical theory has tended to depart from the traditional picture of imagination as something to be transcended through the use of reason; see, for example, Soyarslan (2014, 243–7), Steinberg (2014, 187–192) and James (2014, 154–159). Although Spinoza may rightly be called a rationalist in a number of senses, his account of how we achieve “freedom of mind, or blessedness” (E5pref) appears to depend as much on non-rational powers of imagination and memory as it does on reason.

4. Conclusion

In Spinoza’s view, human moral judgments are grounded in human desires or beliefs. However, in spite of this anti-realist metaethics, Spinoza endorses an intellectualist version of ethical egoism: reason dictates that we seek our greatest good, and this greatest good is understanding. He further tempers his ethical egoism by endorsing a version of contractarianism, according to which we may be bound to obey laws even when we recognize them to be irrational, and they seem to hinder our efforts to seek our greatest good, since the alternative (living without the help of civil society) will always be far worse. Finally, to aid us in the pursuit of understanding, which is often hindered by our passions, Spinoza provides a series of “remedies” by which the force of the passions may be mitigated.

Thus, in spite of the fact that Spinoza initially appears to have no interest in our contemporary notion of moral philosophy, the moral theory he develops has a surprising degree of depth and nuance. Indeed, since he builds his account of morality on top of a thoroughly naturalistic conception of the world, and of humanity’s place in it—and since our desire not to be mastered by our passions remains as strong today as it was in the 17th century—Spinoza’s moral philosophy remains alive for us today.

5. References and Further Reading

a. Primary Sources

Passages from Spinoza’s Ethics are cited in the usual way. For example, ‘E1p25’ refers to Ethics part 1 proposition 25; ‘E1p25d’ refers to the demonstration of that proposition; ‘E1p25s’ to its scholium; and ‘E1p25c’ to its corollary. Reference to the Gebhardt edition page numbers is provided where the usual citation would refer to a span of more than one page.

Descartes, Rene. The Philosophical Writings of Descartes [vols. I–II], eds. J. Cottingham, R. Stoothoff, and D. Murdoch. (Cambridge: Cambridge University Press, 1985). [CSM]
Hobbes, Thomas. Leviathan, with selected variants from the Latin edition of 1668, ed. E. Curley. (Indianapolis: Hackett, 1994). [L]
Spinoza, Benedict de. Opera, ed. C. Gebhardt. (Heidelberg: Carl Winters Universitätsverlag, 1925). [G]
Spinoza, Benedict de. The Collected Works of Spinoza, ed. and trans. E. Curley. (Princeton: Princeton University Press, 1988). [E]
Spinoza, Benedict de. The Letters, ed. and trans. S. Shirley. (Indianapolis: Hackett Publishing, 1995). [Ep.]
Spinoza, Benedict de. Theological-Political Treatise, ed. and trans. S. Shirley. (Indianapolis: Hackett, 1998). [TTP]

b. Secondary Sources

Bennett, Jonathan. A Study of Spinoza’s Ethics. (Indianapolis: Hackett, 1984).
Curley, Edwin. Behind the Geometrical Method. (Princeton: Princeton University Press, 1988).
Curley, Edwin. “Spinoza’s Moral Philosophy.” In Spinoza: A Collection of Critical Essays, ed. M. Grene (Notre Dame: University of Notre Dame Press, 1979), 354–376.
Della Rocca, Michael. “Egoism and the Imitation of the Affects in Spinoza.” In Spinoza on Reason and the ‘Free Man’: The Jerusalem Conference [vol. 4], eds. Y. Yovel and G. Segal. (New York: Little Room Press, 2004), 123–147.
Garrett, Don. “Spinoza’s ethical theory.” In Cambridge Companion to Spinoza, ed. D. Garrett. (Cambridge: Cambridge University Press, 1996), 267–314.
Garrett, Don. “‘A Free Man Always Acts Honestly, Not Deceptively’: Freedom and the Good in Spinoza’s Ethics.” In Spinoza: Issues and Directions, eds. E. Curley and P. F. Moreau (Leiden: Brill, 1990), 221–38.
Grey, John. “Spinoza on Composition, Causation, and the Mind’s Eternity.” British Journal for the History of Philosophy 22(3), 2014: 446–467.
Grey, John. “‘Use Them At Our Pleasure’: Spinoza on Animal Ethics.” History of Philosophy Quarterly 30(4), 2013: 367–388.
Hübner, Karolina. “Spinoza on Being Human and Human Perfection.” In Essays on Spinoza’s Ethical Theory, eds. M. Kisner and A. Youpa. (Oxford: OUP, 2014), 124–142.
James, Susan. “Spinoza, the Body, and the Good Life.” In Essays on Spinoza’s Ethical Theory, eds. M. Kisner and A. Youpa. (Oxford: OUP, 2014), 143–159.
Kisner, Matthew. Spinoza on Human Freedom: Reason, Autonomy, and the Good Life. (Cambridge: Cambridge University Press, 2011).
Kisner, Matthew. “Spinoza’s Benevolence: The Rational Basis of Acting for the Benefit of Others.” Journal of the History of Philosophy 47(4), 2009: 549–567.
Kisner, Matthew and Andrew Youpa (eds.). Essays on Spinoza’s Ethical Theory. (Oxford: OUP, 2014).
Kober, Gal. “For They Do Not Agree In Nature: Spinoza and Deep Ecology.” Ethics and the Environment 18(1), 2013: 43–65.
Lebuffe, Michael. From Bondage to Freedom: Spinoza on Human Excellence. (Oxford: OUP, 2010).
Lebuffe, Michael. “Spinoza’s Normative Ethics.” Canadian Journal of Philosophy 37(3), 2007: 371–392.
Lin, Martin. “The Power of Reason in Spinoza.” In Cambridge Companion to Spinoza’s Ethics, ed. O. Koistinen. (Cambridge: Cambridge University Press, 2009), 258–283.
Marshall, Colin. “Spinoza on Destroying Passions with Reason.” Philosophy and Phenomenological Research 85(1), 2012: 139–160.
Melamed, Yitzhak. “Spinoza’s Anti-Humanism: An Outline.” In The Rationalists: Between Tradition and Innovation, eds. C. Fraenkel, D. Perinetti, and J. E. H. Smith. (Boston: Kluwer, 2011), 147–166.
Nadler, Steven. Spinoza’s Ethics: An Introduction. (Cambridge: Cambridge University Press, 2006).
Naess, Arne. “Spinoza and Ecology.” Philosophia 7, 1977: 45–54.
Rutherford, Donald. “Spinoza and the Dictates of Reason.” Inquiry 51(5), 2008: 485–511.
Soyarslan, Sanem. “From Ordinary Life to Blessedness.” In Essays on Spinoza’s Ethical Theory, eds. M. Kisner and A. Youpa. (Oxford: OUP, 2014), 236–257.
Steinberg, Justin. “Following a Recta Ratio Vivendi: The Practical Utility of Spinoza’s Dictates of Reason.” In Essays on Spinoza’s Ethical Theory, eds. M. Kisner and A. Youpa. (Oxford: OUP, 2014), 178–196.
Verbeek, Theo. Spinoza’s Theologico-Political Treatise: Exploring ‘The Will of God’. (Burlington, VT: Ashgate, 2003).
Wilson, Catherine. “The Strange Hybridity of Spinoza’s Ethics.” In Early Modern Philosophy, eds. C. Mercer and E. O’Neill. (Oxford: OUP, 2005), 86–99.
Youpa, Andrew. “Spinoza’s Theories of Value.” British Journal for the History of Philosophy 18(2), 2010: 209–229.
Youpa, Andrew. “Spinoza’s Theory of the Good.” In Cambridge Companion to Spinoza’s Ethics, ed. O. Koistinen. (Cambridge: Cambridge University Press, 2009), 242–57.

Author Information

John Grey
Email: jrtgrey@gmail.com
Michigan State University
U. S. A.

The Upaniṣads

The Upaniṣads are ancient texts from India that were composed orally in Sanskrit between about 700 B.C.E. and 300 B.C.E. There are thirteen major Upaniṣads, many of which were likely composed by multiple authors and comprise a variety of styles. As part of a larger group of texts, known as the Vedas, the Upaniṣads were composed in a ritual context, yet they mark the beginning of a reasoned enquiry into a number of perennial philosophical questions concerning the nature of being, the nature of the self, the foundation of life, what happens to the self at the time of death, the good life, and ways of interacting with others. As such, the Upaniṣads are often considered to be the fountainhead of the subsequent rich and varied philosophical tradition in India. The Upaniṣads contain some of the oldest discussions about key philosophical terms such as ātman (the self), brahman (ultimate reality), karma, and yoga, as well as saṃsāra (worldly existence), mokṣa (enlightenment), puruṣa (person), and prakṛti (nature)—all of which would continue to be central to the philosophical vocabulary of later traditions. In addition to contributing to the development of a discursive language, the Upaniṣads further frame later philosophical debates by their exploration of a number of means of attaining knowledge, including deduction, comparison, introspection, and debate.

1. The Upaniṣads and the Vedas
a. Main Upaniṣads
b. Minor Upaniṣads
2. From Ritual to Philosophy
3. The Self
4. Ātman and Brahman
5. Karma, Saṃsāra, and Mokṣa
6. Ethics and the Upaniṣads
7. The Upaniṣads and Hindu Darśanas before Vedānta
8. The Upaniṣads and Vedānta
9. The Upaniṣads as Philosophy
10. The Upaniṣads in the Modern Period
11. References and Further Reading
a. Primary Sources
b. Secondary Sources

1. The Upaniṣads and the Vedas

a. Main Upaniṣads

The Upaniṣads are the fourth and final section of a larger group of texts called the Vedas. There are four different collections of Vedic texts, the Ṛgveda, Yajurveda, Sāmaveda, and Atharvaveda, with each of these collections containing four different layers of textual material: the Saṃhitās, Brāhmaṇas, Āraṇyakas, and Upaniṣads. Although each of these textual layers has a variety of orientations, the Saṃhitās are known to consist largely of hymns praising gods and the Brāhmaṇas are mostly concerned with describing and explaining Vedic rituals. The Āraṇyakas and Upaniṣads are also firmly rooted in ritual, but with both groups of texts there is an increasing emphasis on understanding the meaning of ritual, while some sections of the Upaniṣads seem to move completely away from the ritual setting into naturalistic and philosophical inquiry about the processes of life and death, the workings of the body, and the nature of reality.

The Vedic Upaniṣads are widely recognized as being composed during two chronological stages. The texts of the first period, which would include the Bṛhadāraṇyaka (BU), Chāndogya (CU), Taittirīya (TU), Aitareya (AU) and Kauṣītakī (KsU), are generally dated between 700 and 500 B.C.E., and are considered to predate the emergence of the so-called heterodox traditions, such as the Buddhists, Jains, and Ājīvikas. Scholarly consensus dates the second stage of Vedic Upaniṣads, which includes the Kena (KeU), Kaṭha (KaU), Īśā (IU), Śvetāśvatara (SU), Praśna (PU), Muṇḍaka (MuU), Māṇḍūkya (MaU), and Maitrī (MtU), between 300 and 100 B.C.E. (Olivelle 1998: 12-13). The older Upaniṣads are primarily composed in prose, while the later ones tend to be in metrical form, but any individual text may contain a diversity of compositional styles. Additionally, many individual Upaniṣads consist of various types of material, including creation myths, interpretations of ritual actions, lineages of teachers and students, magical formulae, procreation rites, and narratives and dialogues about famous teachers, students, and kings.

The so-called Hindu darśanas—Nyāya, Vaiśeṣika, Mīmāṃsā, and Vedānta—do not adhere to the chronology above, as they regard all the Vedic Upaniṣads as śruti, meaning a timeless revealed knowledge. The remaining two Hindu darśanas—Sāṃkhya and Yoga—are usually read as supporting the Vedas. However, when tracing the historical development of philosophical ideas, it is helpful to note some differences in orientation between the two stages of Upaniṣadic material. While all the Upaniṣads devote considerable attention to topics such as the self (ātman) and ultimate reality (brahman), as well as assume some version of the karma doctrine, the earlier texts tend to characterize ultimate reality in abstract and impersonal ways, while the later Upaniṣads, particularly the Īśā and Śvetāśvatara, are more theistic in orientation. Meanwhile, the later Upaniṣads explicitly address a number of key topics such as yoga, mokṣa, and saṃsāra, all of which would continue to be central aspects of subsequent Indian philosophy.

b. Minor Upaniṣads

In addition to those affiliated with the Vedas, there are literally hundreds of other texts bearing the name “Upaniṣad.” These texts have been grouped together by scholars according to common themes, such as the Yoga Upaniṣads (Upaniṣads on Yoga), the Saṃnyāsa Upaniṣads (Upaniṣads on Renunciation), the Śaiva Upaniṣads (Upaniṣads on the Hindu God Śiva), and the Vaiṣṇava Upaniṣads (Upaniṣads on the Hindu God Viṣṇu) (see Deussen 1980 and Olivelle 1992). The majority of these texts were composed between the 2nd and 15th centuries CE, although texts referred to as “upaniṣad” have continued to be composed up to the present day. Many of the post-Vedic Upaniṣads further develop core concepts from the Vedic Upaniṣads, such as ātman, brahman, karma, and mokṣa. In addition to a shared conceptual world, the post-Vedic Upaniṣads often quote extensively from the earlier texts and feature many of the same teachers and students, such as Yājñavalkya, Janaka, and Śaunaka.

2. From Ritual to Philosophy

Despite their significant contribution to subsequent Indian philosophical traditions, there has been disagreement about whether or not the Upaniṣads themselves constitute philosophy. Much of this debate depends, of course, on how one defines philosophy. A recurring argument as to why the Upaniṣads might not be considered philosophy is that they do not contain a unified or a systematic position. This, however, largely reflects the composite and fragmented nature of the texts. Rather than being characterized as unsystematic, the diversity of teachings can be better understood when considering the fact that different texts were composed within the context of separate and often competing scholarly traditions or schools (śākhas). Accordingly, the Upaniṣads do not have a unified philosophical system, but rather contain a number of overlapping themes and mutual interests. Nonetheless, there can be considerable uniformity within a particular text or within a group of texts ascribed to the same school, and even more so according to the lessons ascribed to any particular teacher. In addition to the distinct philosophical agendas of different texts, we see different teachers articulate their teachings within the context of competition over recruiting students, securing patronage, and debating with rivals in public contests. With this context in mind, it is not surprising to find various, sometimes conflicting, teachings throughout the texts.

Due to their connection with previous Vedic material, the Upaniṣads generally assume a ritual context, containing many passages that explain the significance of ritual actions or interpret mantras (sacred verses) uttered during the ritual. One of the most prevalent tendencies to continue from the ritual texts is an attempt to identify the underlying connections (bandhus) that exist among different orders of reality. Often these connections were made among three spheres: the cosmos, the body of the sponsor of the ritual (yajamāna), and the ritual grounds—in other words, between the macrocosm, the microcosm, and the ritual. An illustrative example appears at the beginning of the Bṛhadāraṇyaka Upaniṣad, where the different body parts of the horse in the sacrifice (aśvamedha) are compared to the different elements, regions, and intervals of time in the cosmos (BU 1.1). The implication is that by reflecting on the relational composition of the horse, one can understand the structure of the universe.

There have been some debates regarding the meaning of the word “upaniṣad,” with the components of the word (upa + ni + sad) suggesting texts that were to be learned ‘sitting down near’ one’s teacher. However, the word is not employed in this way in the texts, nor in existing commentaries. Rather, in its earliest textual contexts, the word “upaniṣad” takes on a meaning similar to bandhu, describing a connection between things, often presented in a hierarchical relationship. In these contexts, upaniṣad is often interpreted as the most essential or most fundamental connection. Moreover, “upaniṣad” designates equivalences between components of different realms of reality that were not considered to be observable by the senses, but remained concealed and obscured, and required special knowledge or understanding. On several occasions, “upaniṣad” means ‘secret teaching’ (see CU 1.1.10; 1.13.4; 8.8.4; 4.2.1; 5.5.3-4), a notion that is reinforced by the use of other formulations such as guhyā ādeśā (‘hidden instruction’; BU 3.5.2) and para guhya (‘supreme secret’; KaU 3.17; SU 6.22). In the Bṛhadāraṇyaka Upaniṣad, the word “upaniṣad” is equated with the formulation satyasya satyam (BU 2.2.20)—‘the truth behind the truth’—an expression suggesting that an upaniṣad is a truth or reality beyond that which appears to be true.

Whether discussing the essence of life or the source of a king’s power, the Upaniṣads show an interest in establishing a firm foundation or an ontological grounding for different aspects of reality, and ultimately, for reality as a whole. One of the terms most associated with these discussions is brahman. The oldest usages of the word are closely connected with the power of speech, with brahman meaning a truthful utterance or powerful statement. In the Upaniṣads, brahman retains this connection with speech, but also comes to refer to the underlying reality or the ontological foundation. In some passages brahman is associated with truth (TU 1.1), while on other occasions it is linked with immortality (CU 2.23.1) or characterized as a heavenly abode (BU 4.4.7-8).

3. The Self

One of the most widely discussed topics throughout both the early and late Upaniṣads is the self (ātman). The word “ātman” is a reflexive pronoun, likely derived from √an (to breathe). Even in the Ṛgveda (c. 1200 B.C.E.), the earliest textual source from ancient India, ātman already had a wide range of lexical meanings, including ‘breath’, ‘spirit’, and ‘body’. By the time of the Upaniṣads, the word was used in a variety of ways, sometimes referring to the material body, but often designating something like an essence, a life-force, consciousness, or ultimate reality.

One of the best-known teachings of ātman appears in the Chāndogya Upaniṣad (6.1-16), as the instruction of the brahmin Uddālaka Āruṇi to his son Śvetaketu. Uddālaka begins by explaining that one can know the universal of a material substance from a particular object made of that substance: by means of something made of clay, one can know clay; by means of an ornament made of copper, one can know copper; by means of a nail cutter made of iron, one can know iron. Uddālaka uses these examples to explain that objects are not created from nothing, but rather that creation is a process of transformation from an original being (sat) which emerges into the multiplicity of forms that characterizes our everyday experiences. Uddālaka’s explanation of creation is often assumed to have influenced the satkāryavāda theory—the theory that the effect exists within the cause—which was accepted by the Sāṃkhya, Yoga, and Vedānta darśanas.

Later in his instruction to Śvetaketu, Uddālaka makes a series of inferences from comparisons with empirically observable natural phenomena to explain that the self is a non-material essence present in all living beings. He first uses the example of nectar, which bees collect from different sources but which, when gathered together, becomes an undifferentiated whole. Similarly, water flowing from different rivers merges together without distinction when reaching the ocean. Uddālaka then asks Śvetaketu to conduct two simple experiments. In the first he instructs his son to cut a banyan fruit, and then the seed within the fruit, only for his son to find that he cannot observe anything inside the seed. Uddālaka compares the fine essence of the seed, which cannot even be seen, to the self. Uddālaka then tells Śvetaketu to place some salt in water. When returning the next day, Śvetaketu cannot see the salt anywhere in the water, but by tasting the water he perceives that it is equally distributed throughout. Uddālaka concludes that, like salt in water, the self is not immediately discernible, yet permeates the entire body. After each of these comparisons with natural phenomena Uddālaka brings attention back to Śvetaketu, emphasizing that the self operates the same way in him as it does in all living beings. Repeating the phrase ‘you are that’ (tat tvam asi) throughout his discourse, Uddālaka teaches that the self is both the essence that connects parts with the whole and the constant that remains the same even while taking on different forms. Thus, he offers an organic understanding of ātman, characterizing the self in terms of the life force that animates all living beings.

Yājñavalkya, the most prominent teacher in the Bṛhadāraṇyaka Upaniṣad, characterizes ātman more in terms of consciousness than as a life-giving essence. In a debate that pits him against Uddālaka—his senior colleague and, by some accounts, his former teacher (BU 6.3.7; 6.5.3)—Yājñavalkya explains that the self is the inner controller (antaryāmin), present within all sensing and cognizing, yet at the same time distinct (BU 3.7.23). Here, Yājñavalkya characterizes the self as that which has mastery over the otherwise distinct psycho-physical capacities. He goes on to explain that we know the existence of the self through actions of the self, through what the self does, not through our senses—that the self, as consciousness, cannot be an object of consciousness.

Another recurring theme in Yājñavalkya’s discussion with Janaka is that the self is described as consisting of various parts, but not reducible to any (for example, BU 4.4.5; see also TU 2.2.1). Similarly, in a creation myth at the beginning of the Aitareya Upaniṣad, ātman is cast as a creator god, who creates the various elements and bodily functions from himself (AU 1.3.11). As with Yājñavalkya’s teaching, in this passage the functions of the body and cognitive capacities are seen to be components of the self and even evidence of the self, but the self cannot be reduced to any particular part. Such examples emphasize that an understanding of the self cannot be attained through observing how the self operates in just one faculty, but by means of observing the self in relation to a number of psycho-physical faculties, and their relationship with each other. In addition to being portrayed as the agent or inner controller (antaryāmin) of sensing and cognizing, the self is characterized as an underlying base or foundation (pratiṣṭha) of all the sense and cognitive faculties. Throughout his teachings Yājñavalkya describes the self as being hidden or behind that which is immediately perceptible, suggesting that the self cannot be known by rational thought or described in conventional language because it can never be the object of thought or knowledge. Here, Yājñavalkya draws attention to the limitations of language, suggesting that because the self cannot be an object of knowledge it cannot have attributes, and therefore can only be described by using negative propositions.

Another prominent teacher of the self is Prajāpati, the creator god of Vedic ritual texts, who is recast in the Chāndogya Upaniṣad as a typically aloof guru, who is reluctant to disseminate his teachings (CU 8.7-12). Like Yājñavalkya, Prajāpati conceptualizes the self in terms of consciousness, describing ātman as the agent responsible for sensing and cognizing: ātman is ‘the one who is aware’ (CU 8.12.4-5). However, despite some similarities with Yājñavalkya’s teaching of ātman, Prajāpati seems to reject some of his positions. Prajāpati’s teaching is presented in the context of his instruction to the god Indra, taking place during several episodes over a period of more than one hundred years. In his first teaching Prajāpati defines the self as the material body, and sends Indra away thinking he has learned the true teaching. Before going back to the other gods, however, Indra realizes that this teaching cannot be true, and returns to Prajāpati to learn more. This pattern continues several times, before Prajāpati at last presents ātman as ‘the one who is aware’, his final and true teaching. One of the teachings that Prajāpati presents as false, or at least as incomplete, is a description of ātman in terms of dreamless sleep, a teaching of the self that Yājñavalkya describes as the ‘highest goal’ and ‘the highest bliss’ in his instruction to King Janaka in the Bṛhadāraṇyaka Upaniṣad (4.3.32).

Despite the diversity among these teachings, most of the discussions represent a different set of concerns than those found in earlier Vedic texts, with many teachings focusing on the human body and individual person as opposed to the primordial or ideal body, as often discussed in Vedic rituals. Rather than assuming a correspondence between the human body and the universe, some of the teachings about the self in the Upaniṣads begin to show an interest in the fundamental essence of life.

4. Ātman and Brahman

Perhaps the most famous teaching of the self, the identification of ātman and brahman, is delivered by Śāṇḍilya in the Chāndogya Upaniṣad. After describing ātman in various ways, Śāṇḍilya equates ātman with brahman (CU 3.14.4), implying that if one understands brahman as the entire world, and one understands that the self is brahman, then one becomes the entire world at the time of death.

Although Śāṇḍilya’s teaching of ātman and brahman is often considered the central doctrine of the Upaniṣads, it is important to remember that this is not the only characterization either of the self or of ultimate reality. While some teachers, such as Yājñavalkya, also equate ātman with brahman (BU 4.4.5), others, such as Uddālaka Āruṇi, do not make this identification. Indeed, Uddālaka, whose famous phrase tat tvam asi is later taken by Śaṅkara to be a statement of the identity of ātman and brahman, never uses the term “brahman”—neither in his instruction to his son Śvetaketu, nor on any other of his many appearances in the Upaniṣads. Moreover, it is often unclear, even in Śāṇḍilya’s teaching, whether linking ātman with brahman refers to the complete identity of the self and ultimate reality, or if ātman is considered an aspect or quality of brahman. Such debates about how to interpret the teachings of the Upaniṣads have continued throughout the Indian philosophical tradition, and are particularly characteristic of the Vedānta darśana.

Furthermore, while most teachings about brahman assume that the world emerged from one undifferentiated abstract cosmic principle, there are a number of passages explaining creation in terms of a more materialist point of view, describing the world as coming forth from an initial natural element, such as water or air. The Bṛhadāraṇyaka Upaniṣad (5.1), for example, contains a teaching attributed to the son of Kauravyāyanī, depicting brahman as space. This same section of the Bṛhadāraṇyaka Upaniṣad (5.5.1) includes a passage describing the world as beginning from water. Similarly, in the Chāndogya Upaniṣad (4.3.1-2), Raikva traces the beginning of the world to wind in the cosmic sphere, and breath in the microcosm.

Returning to the self, and keeping in mind later philosophical developments, it is also worth noting that the Upaniṣads often present ātman in ways that contrast with the changeless and inactive descriptions of the self as articulated by traditions such as Sāṃkhya, Yoga, and Advaita Vedānta. As we have seen, the self can be characterized as both active and dynamic: as the inner controller (antaryāmin), the self is depicted as the agent or actor behind all sensing and cognizing faculties (for example, BU 3.7.23); while as a creator god, ātman is cast as a personal deity—closely resembling Prajāpati—from whom all creation emanates (BU 1.4.1; 1.4.17; TU 2.1; AU 1.1).

One feature of the self that is quite consistent throughout the Upaniṣads and continues to be shared by a number of subsequent schools of Hindu philosophy is that knowledge of ātman can lead to some sort of liberation or ultimate freedom. While the Sāṃkhya and Yoga schools would conceptualize such emancipation as kaivalya (abstraction, autonomy from nature) and Advaita Vedāntins as freedom from ignorance (avidyā), in the Upaniṣads the ultimate goal achieved through knowledge of the self is primarily freedom from death. Nonetheless, a prominent philosophical strand in the Upaniṣads, particularly in the teachings of Yājñavalkya, is that ātman dwells within the body when it is alive, that ātman, in one way or another, is responsible for the body being alive, and that ātman does not die when the body dies, but rather finds a dwelling place in another body. Such depictions seem to have been a catalyst for or been developed alongside early Buddhist conceptions of selfhood. The Buddhists explicitly rejected any notion of an indivisible and unchanging self, not only introducing the term “not-self” (anātman in Sanskrit; anattā in Pāli) to describe the lack of any fixed essence, but also explaining karmic continuity from one lifetime to the next in terms of the five skandhas, a theory maintaining that what Upanishadic thinkers take to be a unified self is really made of five components, all of which are subject to change.

5. Karma, Saṃsāra, and Mokṣa

Karma (“karman”) is another central concept in the Indian philosophical tradition that finds some of its first philosophical articulations in the Upaniṣads. Literally meaning ‘action’, karma emerges out of a ritual context where it refers to any ritual action, which, if performed correctly, yields beneficial results, but if performed incorrectly, brings about negative consequences. The Upaniṣads do not offer any explicit theory of karma, but do contain a number of teachings that seem to extend the notion of karma beyond the ritual context to more general understandings of moral retribution and of causality. Yājñavalkya, for example, when asked by Ārtabhāga about what happens to a person after death, responds that a person becomes good by good action and bad by bad action (BU 3.2.13). Here and elsewhere, one of Yājñavalkya’s fundamental assumptions is that present actions have consequences in the future and that our present circumstances have been shaped by our past actions. While this law-like character of karma suggests that the consequences of one’s actions shape one’s future, Yājñavalkya does not give any indication that the future is completely determined. Rather, he seems to suggest that one can create good consequences in the future by performing good actions in the present. In other words, Yājñavalkya presents karma more as a theory to promote good actions, than as a fatalistic doctrine in which the future is fixed.

While Yājñavalkya assumes that karma takes place across lifetimes, he does not attempt to explain the mechanisms of rebirth. In the Chāndogya Upaniṣad, however, King Pravāhaṇa Jaivali is more specific about how karma and rebirth operate, describing the link between them in terms of a naturalistic philosophy (CU 5.4-10). In a dialogue that also appears in the Bṛhadāraṇyaka Upaniṣad (BU 6.2.9-16), but without the explicit connection to karma, Pravāhaṇa discloses the teaching of the five fires (pañcāgnividyā) to Uddālaka Āruṇi. Pravāhaṇa’s instruction describes human life as part of a cycle of regeneration, whereby the essence of life takes on different forms as it passes through different levels of existence: when humans die, they are cremated and travel in the form of smoke to the other world (the first fire), where they become soma; as soma they enter a rain cloud (the second fire) and become rain; as rain they return to earth (the third fire), where they become food; as food they enter man (the fourth fire), where they become semen; as semen they enter a woman (the fifth fire) and become an embryo. According to Pravāhaṇa, those who know the teaching of the five fires follow the path of the gods and enter the world of brahman, but those who do not know this teaching will follow the path of the ancestors and continue to be reborn.

Pravāhaṇa states that knowledge of the teaching of the five fires will affect the conditions of one’s future births. He explains that people who are pleasant will enter a ‘pleasant womb’ such as the womb of a brahmin, a kṣatriya, or a vaiśya, but that people of foul behavior can expect to enter the womb of a dog, a pig, or an outcaste (CU 5.10.7). In this teaching, Pravāhaṇa demonstrates the link between karma and rebirth by specifying different types of animals (dogs, pigs) and different types of matter (smoke, rain, food, semen) through which karma operates. By implication, karma not only applies to the causes and effects of human actions, but also includes non-human animals and other forms of organic and inorganic matter. Moreover, karma is not directed by a divine being, but rather is described as an independent, natural process. As such, karma is presented as an impersonal moral force that operates throughout the totality of existence, balancing out the consequences of good and bad action. Here, we see that Pravāhaṇa’s teaching implies that everyone’s actions have moral consequences and that all the actions of humans and non-humans are interconnected.

Such discussions linking actions in one lifetime to consequences in a future one would become widely accepted in subsequent philosophical discourse—not only among Hindus, but also by Buddhists and Jains, and, to a certain extent, by the Ājīvikas. In subsequent developments across these traditions, karma would often be conceptualized in terms of intention and much of what we might describe as ethics was to be focused on ways to cultivate a state of mind that would generate positive rather than negative intentions.

Despite the development of ideas about karma, the earliest Upaniṣads generally do not contain the assumption that life is suffering (duḥkha), or illusion (māyā), or ignorance (avidyā)—views that would later dominate discussions of karma and rebirth. Nonetheless, we do see the introduction of the term “saṃsāra” in the comparatively late Kaṭha (3.7) and Śvetāśvatara (6.16) Upaniṣads. Literally meaning, ‘that which turns around forever’, saṃsāra refers to the cycle of birth, life, death, and rebirth. All living creatures, including the gods, are considered to be a part of saṃsāra. Accordingly, death is not considered to be final, and rebirth is an essential aspect of existence.

Closely related to saṃsāra is mokṣa, the concept that one can escape or be released from the endless cycle of repeated births. As with saṃsāra, the Upaniṣads do not contain an explicit theory of mokṣa, with the term “mokṣa” only assuming its connotations of liberation in the later texts (that is, SU 6.16). The Hindu darśanas would subsequently consider mokṣa to be a fundamental teaching of all the Upaniṣads, but the texts themselves, particularly the early ones, focus much more attention on securing wealth, status, and power in this lifetime than on describing existence as an endless cycle. They also tend to present life as desirable, and not as a condition from which people need release or escape. One of the most common soteriological goals is immortality, amṛta, which literally means ‘not dying’. The Upaniṣads describe immortality in different ways, including having a long life span, surviving death in the heavenly world, becoming one with the essential being of the universe, and being preserved in the social memory.

6. Ethics and the Upaniṣads

Philosophy in the Upaniṣads does not merely consist of abstract claims about the nature of reality, but is also presented as a way of living one’s life. In Yājñavalkya’s teaching to Janaka, for example, knowledge of ātman is associated with a change in one’s disposition and behavior. As we have seen, karma is characterized as a natural moral process, with knowledge of the self as a way out of that process. In this respect, a fundamental assumption throughout many teachings of the self is that it is untouched by karma. Yājñavalkya teaches Janaka that knowledge of the self is beyond virtuous (kalyāṇa) and evil (pāpa)—that, through knowledge of the self one reaches the world of brahman, where the good or bad actions of one’s life do not follow (BU 4.4.22).

Yājñavalkya explains that in the world of brahman a thief is not a thief, a murderer is not a murderer, an outcaste not an outcaste, a mixed-caste person (paulkasa) is not a mixed-caste person, a renunciate (śramaṇa) is not a renunciate, and an ascetic not an ascetic, that neither the good (puṇya) nor the evil (pāpa) follow him (BU 4.3.22; see also TU 2.9.1). In using these examples Yājñavalkya illustrates the degree to which knowledge of the self is beyond everyday notions of moral behavior. In other words, he seems to be saying that even if one has committed evil deeds, one can still be liberated from karma by means of knowing the self. Yet Yājñavalkya is not suggesting that one can continue to perform ‘evil’ deeds without suffering karmic retribution. Rather, as he asserts later in his discussion with Janaka: when one is knowledgeable, one necessarily acts morally. Yājñavalkya explains that a man who has proper knowledge becomes calm (śānta), restrained (dānta), withdrawn (uparata), patient (titikṣu), and composed (samāhita) (BU 4.4.23). Here Yājñavalkya characterizes knowledge of the self as a change in one’s disposition. In other words, one who is a knower of the self becomes a person of good character and—by definition—would not perform an evil action.

While Yājñavalkya talks about becoming calm, restrained, withdrawn, patient, and composed, these dispositions are not presented as virtues to cultivate for the sake of knowledge, but rather as consequences of knowing ātman. Subsequent texts would devote considerable attention to how one should cultivate oneself in order to achieve the highest knowledge. For example, both the eight-fold path of the Buddhist Nikāyas and the eight limbs in the Yoga Sūtra suggest that one needs to live a moral life in order to achieve true knowledge. In Yājñavalkya’s teaching about ātman, however, there is more attention paid to the objective of knowing the self than the ethical means of controlling the self.

Despite the lack of details about the path to knowledge, Yājñavalkya nevertheless connects knowledge of ātman with particular practices, explaining to Janaka that brahmins seek to know ātman by means of vedic recitation (vedānuvacana), sacrifice (yajña), gift-giving (dāna), austerity (tapas), and fasting (BU 4.4.22). Yājñavalkya elaborates, claiming that by knowing the self, one becomes a sage (muni), undertaking an ascetic and peripatetic lifestyle (BU 4.4.22). Here, Yājñavalkya implies that those who come to know ātman will become renunciates—that knowledge of the self not only brings about certain dispositions or a certain character, but also provokes a particular lifestyle. Similarly, in the Muṇḍaka Upaniṣad, Aṅgiras teaches Śaunaka that the self can be mastered by means of asceticism and celibacy, among other practices (MU 3.1.5).

With the connection between knowledge and lifestyle, there are notable gender implications of Upanishadic teachings. Yājñavalkya, for example, assumes that the main knowers of the self will be brahmin men, even claiming that through knowledge of the self one can become a brahmin (BU 4.4.23). The word “ātman” is grammatically masculine and teachings of the self are directed specifically towards a male audience and articulated in overtly androcentric metaphors (Black 2007: 135-41). Nonetheless, a number of teachings of the self suggest that true knowledge goes beyond gender distinctions. As we have seen, Uddālaka Āruṇi describes the self as an organic, universal life-force, while Yājñavalkya teaches that one who knows the self will see the self in all living beings (BU 4.4.23). It is also noteworthy that the Upaniṣads depict several women—such as Gārgī and Maitreyī—as participating in philosophical discussions and debates (Black 2007: 48-67; Lindquist 2008).

7. The Upaniṣads and Hindu Darśanas before Vedānta

The influence of the Upaniṣads on the so-called ‘Hindu’ darśanas is more oblique than explicit, with few direct references, yet with many of the dominant terms and concepts seemingly inherited from them. Many of the six main Hindu schools officially recognize the Upaniṣads as a source of philosophy in so far as they recognize śabda as a valid means for attaining knowledge. Śabda literally means ‘word’, but in philosophical discourse it refers to verbal testimony or reliable authority, and is sometimes taken to refer specifically to śruti. Despite the nominal acceptance of śabda as a pramāṇa, however, the Upaniṣads are only cited occasionally in the surviving texts, and rarely as a source to validate fundamental arguments, before the emergence of the Vedānta school in the 7th century.

Notably, the Upanishadic notion of self—as a spiritual essence separate from the physical body—is generally accepted by the classical Hindu philosophical schools. The Nyāya and Mīmāṃsā darśanas, for example, which do not cite the Upaniṣads to prove its existence, nevertheless describe the self as an immaterial substance that resides in and acts through the body. In addition to conceptual similarities with certain passages from the Upaniṣads, both schools seem to consider the Upaniṣads as texts that specialize in the self. The Nyāya philosopher Vātsyāyana (c. 350-450 C.E.), for instance, characterizes the Upaniṣads as dealing with the self.

Similarly, the early texts of the Sāṃkhya and Yoga darśanas do not refer to the Upaniṣads when making their fundamental arguments, but do seem to inherit much of their terminology, as well as some of their views, from them. At the beginning of Uddālaka Āruṇi’s instruction to Śvetaketu in the Chāndogya Upaniṣad (6.2-5), for instance, he describes existence (sat) as consisting of three forms (rūpas): fire (red), water (white), and food (black)—a scheme that closely resembles the later Sāṃkhya doctrine of prakṛti and the three guṇas. The Śvetāśvatara Upaniṣad (4.5), the oldest extant text to use the word “sāṃkhya” (5.2), seems to build on Uddālaka’s three-fold scheme when describing the unborn as red, white, and black. Also, a number of core terms in Sāṃkhya philosophy first appear in the Upaniṣads, such as ahaṃkāra (CU 7.25.1) and the tattvas (BU 4.5.12), while some passages contain groups of terms appearing together in ways that are similar to how they appear in later Sāṃkhya texts: the Kaṭha Upaniṣad (3.10-11), for example, lists a hierarchy of principles including person (puruṣa), discernment (buddhi), mind (manas), and the sense capacities (indriyas).

A number of details about the practice of yoga, which would become more systematized by the Yoga darśana, are also first found in the Upaniṣads. The Kaṭha (3.3-13; 6.7-11) and Śvetāśvatara (2.8-11) Upaniṣads both contain some of the earliest descriptions of exercises for controlling the senses, breathing techniques, and bodily postures, with the Śvetāśvatara Upaniṣad (for example, 2.15-17) making explicit connections between yogic practice and union with a personal god—a connection that would be of central importance in the Yoga darśana. The Maitrī Upaniṣad (6-7) has the most extensive and systematic discussion of yoga in the Upaniṣads, containing a number of parallels with the Yoga Sūtra.

In addition to employing terms and concepts from the Upaniṣads, there are occasions when classical Indian philosophers refer to the Upaniṣads directly. Vātsyāyana, of the Nyāya school, quotes passages from the Bṛhadāraṇyaka and Chāndogya Upaniṣads when discussing mokṣa, the means of attaining it, and the stages of life. Additionally, the grammarian Patañjali (c. 150 B.C.E.) argues that the study of grammar is useful for a correct understanding of passages from the Upaniṣads, and thus for attaining mokṣa.

Such examples indicate that the philosophers of classical Hindu philosophy knew the Upaniṣads quite well and would dip into the texts from time to time to provide an analogy or, occasionally, to support one of their arguments. However, the early surviving texts of the Nyāya, Vaiśeṣika, Mīmāṃsā, Sāṃkhya, and Yoga schools do not tend to use the Upaniṣads to validate their core positions. The Vaiśeṣika Sūtra (3.2.8), for example, agrees that the self is discussed in the Upaniṣads, but then argues that the proof of the existence of the self should not be established exclusively by means of śruti, but also can be determined through inference. Additionally, none of the early schools produced a commentary on the Upaniṣads, nor did any of them aim to offer an interpretation of the Upaniṣads as a whole. As such, the Upaniṣads provided a general philosophical framework, as well as serving as a repository for terms and analogies, but none of the early schools claimed the texts for themselves.

An interesting illustration of this point is that competing schools would sometimes recognize that their rivals’ positions were also to be found in the Upaniṣads. The Nyāya philosopher Jayanta Bhaṭṭa even finds the positions of the heterodox Lokāyata darśana, or Materialist school, in the Upaniṣads. In the context of criticizing the validity of śabda as a pramāṇa, Bhaṭṭa argues that if śabda were a valid means for establishing knowledge, then even the doctrines of the Lokāyatas must be true, because their doctrines can be found in the Upaniṣads. Due to a lack of sources from the Lokāyata school, we do not know if they ever referred to the Upaniṣads in their own texts, but Bhaṭṭa’s argument is illustrative of a general reluctance of most of the early schools to put too much stake in śruti as a means of knowledge. His comments are also an acknowledgement that the Upaniṣads contain a variety of viewpoints.

8. The Upaniṣads and Vedānta

The oldest surviving systematic interpretation of the Upaniṣads is the Brahma Sūtra (200 B.C.E.-200 C.E.), attributed to Bādarāyaṇa. Although technically not a commentary (that is, it is a sūtra rather than a bhāṣya), the Brahma Sūtra is an explanation of the philosophy of the Upaniṣads, treating the texts as the source for knowledge about brahman. Despite being considered a Vedānta text, the Brahma Sūtra (a.k.a. Vedānta Sūtra) was composed centuries before the establishment of Vedānta as a philosophical school. The Brahma Sūtra uses the Upaniṣads to refute the position of dualism, as put forth by the Sāṃkhya school. As Śaṅkara does later, the Brahma Sūtra (1.1.3-4) states that śruti is the source of all knowledge about brahman. Additionally, the Brahma Sūtra maintains that mokṣa is the ultimate goal as opposed to action or sacrifice.

Centuries later, the Vedānta darśana was the first philosophical school to attempt to present the Upaniṣads as holding a unified philosophical position. Vedānta means ‘end of the Vedas’ and is often used to refer specifically to the Upaniṣads. The school divides the Vedas into two sections: karmakānda, the section of ritual action (consisting of the Saṃhitās and the Brāhmaṇas), and jñānakānda, the section of knowledge (consisting of the Upaniṣads, and to a certain extent, the Āraṇyakas). According to the Vedānta school, the ritual section contains detailed instructions on how to perform the rituals, whereas the Upaniṣads contain transcendent knowledge for the sake of achieving mokṣa. There are three main branches of the Vedānta school: Advaita Vedānta, Viśiṣtādvaita Vedānta, and Dvaita Vedānta. Although these branches would put forth distinct philosophical positions, they all took śabda as the exclusive means to knowledge about their central doctrines and considered the Upaniṣads, the Brahma Sūtra, and the Bhagavad Gītā as their core texts (prasthānatraya). Despite disagreeing with each other, all three of the best-known philosophers of the Vedānta school (Śaṅkara, Rāmānuja, and Madhva) wrote commentaries on the Upaniṣads, presenting them as having a single, consistent philosophical position.

The most well-known philosopher of the Vedānta school was Śaṅkara (c. 700 C.E.), whose interpretations of the Upaniṣads made a major impact on the Indian philosophical tradition in the centuries after his lifetime and continued to dominate readings of the texts throughout the 19th and early 20th centuries. Śaṅkara was the main proponent of Advaita Vedānta, which put forth a position of non-dualism. According to Śaṅkara the fundamental teaching of the Upaniṣads is that ātman and brahman are one and the same.

For Śaṅkara, the Upaniṣads are not merely sources to back up his claims, but they also provide him with techniques for making his arguments. Śaṅkara takes the Upaniṣads as outlining methods for their own interpretation, following a number of literary criteria as clues for how to read the texts (Hirst 2005: 59-64). Consequently, even when he uses examples not found in the Upaniṣads, Śaṅkara can maintain that his arguments are based on scripture, for as long as he argues in the same way that the Upaniṣads do, he can claim that his arguments are based on his sources.

Despite the significance of Śaṅkara’s philosophy, it is important to note that his interpretation of the Upaniṣads was not the only one accepted by philosophers of the Vedānta school. Rāmānuja (c. 1000 C.E.), the main proponent of a form of Vedānta known as Viśiṣtādvaita, or qualified non-dualism, used the Upaniṣads to argue that ātman is not identical with brahman, but an aspect of brahman. Rāmānuja also found in the Upaniṣads a source for bhakti, as he identified the Upanishadic brahman with God. Two centuries later, Madhva (c. 1200 C.E.) used the Upaniṣads as a source for a dualist branch of the school, known as Dvaita Vedānta. Madhva interpreted brahman as an infinite and independent God, with the self as finite and dependent. As such, ātman is dependent upon brahman, but they are not exactly the same.

It is well known that the Vedānta school became extremely influential in shaping subsequent philosophical debates, and we may conjecture that the tendency for various Vedānta philosophers to use the Upaniṣads in support of their own positions, as well as in their criticisms of rival schools, prompted other schools to engage with the Upaniṣads more closely. This is illustrated by the fact that schools such as Nyāya and Sāṃkhya, which previously seem to have relied very little on the Upaniṣads, began invoking them to counter the claims of Advaita Vedānta.

The Nyāya philosopher Bhāsarvajña (c. 850-950 C.E.), for example, quotes some verses from the Upaniṣads to support his position of a distinction between the ordinary and supreme sense of self when arguing against the Advaita position of non-dualism. Another Nyāya philosopher, Gaṅgeśa (c. 1300 C.E.), seems to be quoting from the Upaniṣads to back up the claim that karmic retribution is not binding for those who know the self, a position stated by Yājñavalkya (BU 4.4.23). Moreover, a number of Sāṃkhya and Yoga philosophers use the Upaniṣads in an attempt to make their schools more compatible with Vedānta. The Sāṃkhya philosopher Nāgeśa (c. 1700-1750 C.E.), for example, draws from the Upaniṣads (as well as the other two source texts of the Vedānta school, the Bhagavad Gītā and Brahma Sūtra) to argue that the Vedānta and Sāṃkhya schools do not contradict each other. This trend can also be found in the Sāṃkhyasūtra (c. 1400-1500 C.E.), which argues that the identification of brahman and ātman was a qualitative identity, but not a numerical one, seemingly defending Sāṃkhya against Śaṅkara’s criticism that the Sāṃkhya doctrine of multiple selves contradicts the Upaniṣads. Interestingly, this argument suggests that Sāṃkhya philosophers not only felt the need to show that their positions did not contradict the Upaniṣads, but also that they basically accepted the Advaita Vedānta reading of the Upaniṣads.

9. The Upaniṣads as Philosophy

As noted above, many of the Upaniṣads are composite and fragmented, and therefore lack a coherent philosophical position. Moreover, the teachers portrayed in the Upaniṣads do not seem to make linear arguments that start with premises and build to larger conclusions, but rather tend to make points through analogies and metaphors, with many core ideas presented as truths or insights known to particular teachers, not as logical propositions that can be independently verified. Nonetheless, in a number of sections of the texts, there appear to be implicit philosophical methods in place. We have already noted that Yājñavalkya’s discussion of the self is based on a reflective introspection (see also MuU 3.1.8-9). The early Upaniṣads do not contain passages explicitly articulating method, but with the development of yoga and meditation in the later texts, introspection begins to be formalized as a philosophical mode of enquiry. Also, many of Uddālaka Āruṇi’s descriptions of ātman are derived from his observations of the natural world.

In addition to providing a repository of terms, concepts, and, to a certain degree, philosophical methods, from which subsequent philosophical schools would draw, the Upaniṣads were also influential in the development of the practice of debates, which would become the defining social practice of Indian philosophy. Although the texts do not discuss debate reflectively, a number of the most important teachings are articulated within the context of discussions between teachers and students, and verbal disputes among rival brahmins. In some dialogues, there is a dialectical relationship between the arguments of competing interlocutors, indicating that the dialogical presentation of teachings was a way of formulating philosophical rhetoric (Black 2015). In this way, debate is another way by which the Upaniṣads extend ideas first articulated in the context of the Vedic ritual into a more philosophical discourse.

10. The Upaniṣads in the Modern Period

The Upaniṣads are some of the most well-known Indian sources outside of India. Their first known translation into a non-Indian language was initiated by the Mughal prince Dārā Shūkōh, son of the emperor Shah Jahan. This Persian translation, known as the Sirr-i Akbar (the Great Secret), consisted of fifty texts, including the Vedic upaniṣads, many of the yoga, renunciate, and devotional upaniṣads, as well as other texts, such as the Puruṣa Sūkta hymn of the Ṛgveda and some material from unidentified sources. Dārā Shūkōh considered the Upaniṣads to be the sources of Indian monotheism and he was convinced that the Koran itself referred to the Upaniṣads.

Henry Thomas Colebrooke’s translation of the Aitareya Upaniṣad in 1805 was the first rendering of an upaniṣad into English. Rammohan Roy subsequently translated the Kena, Īśā, Kaṭha, and Muṇḍaka Upaniṣads into English, while his Bengali translation of the Kena Upaniṣad in 1816 was the first rendering of an upaniṣad into a modern Indian language.

Roy used the introductions of his translations into both Bengali and English to promote the reformation of Hinduism, endorsing the values of reason and religious tolerance, while criticizing practices such as idolatry and caste hierarchy. Roy felt that contemporary religion in India was in decline and hoped that his translations could provide Hindus with direct access to what he considered to be the true doctrines of Hinduism. The Upaniṣads first reached Europe in the modern period through the French philologist Abraham Hyacinthe Anquetil-Duperron’s translation of the Sirr-i Akbar into Latin, which was published in 1804. It was Anquetil-Duperron’s text, known as the Oupnek’hat, which was read by the German philosopher Arthur Schopenhauer, the first major European thinker to engage explicitly with Indian sources. Schopenhauer considered the Upaniṣads, Plato, and Kant to be the three major influences on his work and is known to have kept a copy of Anquetil-Duperron’s translation by his bedside table, reflecting that the Upaniṣads were his consolation in life and would equally be his consolation in death.

11. References and Further Reading

a. Primary Sources

Buitenen, J. A. B. van, tr. 1962. The Maitrāyaṇīya Upaniṣad: A Critical Essay with Text, Translation & Commentary. The Hague: Mouton & Co.
Deussen, Paul, tr. 1980 (originally published in 1897). Sixty Upaniṣads of the Veda, translated by V. M. Bedekar and G. B. Palsule. Delhi: Motilal Banarsidass.
Eggeling, Julius, tr. 1994 (originally published in 1882-97). Śatapatha Brāhmaṇa, Vols. 12, 26, 41, 43 and 44 (5 parts of the Sacred Books of the East; Delhi: Motilal Banarsidass).
Hume, Robert, tr. 1975 (originally published in 1921). The Thirteen Principal Upanishads. London: Oxford University Press.
Keith, A. B., tr. 1995 (originally published in 1909). Aitareya Āraṇyaka. London: Oxford University Press.
Müller, F. Max, tr. 2000 (originally published in 1897). The Upanishads Parts 1-2. Delhi: Motilal Banarsidass.
Olivelle, Patrick, tr. 1992. Saṃnyāsa Upaniṣads. New York: Oxford University Press.
Olivelle, Patrick, tr. 1996. The Upaniṣads. Oxford: Oxford University Press.
Olivelle, Patrick, tr. 1998. The Early Upaniṣads: Annotated Text and Translation. New York: Oxford University Press.
Oertel, H., tr. 1897. ‘The Jaiminīya or Talavakāra Upaniṣad Brāhmaṇa’. Journal of the American Oriental Society 16: 79-260.
Radhakrishnan, Sarvepalli, tr. 1992 (originally published in 1953). The Principal Upaniṣads. New Jersey: Humanities Press.
Roebuck, Valerie, tr. 2004. Upaniṣads. Harmondsworth: Penguin.

b. Secondary Sources

Black, Brian. 2007. The Character of the Self in Ancient India: Priests, Kings, and Women in the early Upaniṣads. Albany: State University of New York Press.
Black, Brian. 2011. “Ambaṭṭha and Śvetaketu: Literary Connections between the Upaniṣads and Early Buddhist Narratives.” Journal of the American Academy of Religion, Vol. 79, No. 1: 136–161.
Black, Brian. 2011. “The Rhetoric of Secrecy in the Upaniṣads.” Essays in Honor of Patrick Olivelle, edited by Steven Lindquist. Florence: Florence University Press: 101-125.
Black, Brian. 2015. “Dialogue and Difference: Encountering the Other in Indian Religious and Philosophical Sources.” Dialogue in Early South Asian Religions: Hindu, Buddhist, and Jain Traditions, edited by Brian Black and Laurie Patton. Farnham, UK: Ashgate: 243-257.
Brereton, Joel. 1990. “The Upanishads.” Approaches to the Asian Classics, edited by Wm. T. de Bary and I. Bloom. New York: Columbia University: 115-135.
Cohen, Signe. 2008. Text and Authority in the Older Upaniṣads. Leiden: Brill.
Deussen, Paul. 2000 (originally published 1919). The Philosophy of the Upanishads. Delhi: Oriental Publishers.
Ganeri, Jonardon. 2007. The Concealed Art of the Soul: Theories of Self and Practices of Truth in Indian Ethics and Epistemology. Oxford: Oxford University Press.
Hirst, J. G. Suthren. 2005. Śaṃkara’s Advaita Vedānta: A Way of Teaching. London: RoutledgeCurzon.
Killingley, Dermot. 1997. “The Paths of the Dead and the Five Fires.” Indian Insights: Buddhism, Brahmanism and Bhakti: Papers from the Annual Spalding Symposium on Indian Religions, edited by Peter Connolly and Sue Hamilton. London: Luzac Oriental.
Lindquist, Steven. 2008. “Gender at Janaka’s Court: Women in the Bṛhadāraṇyaka Upaniṣad Reconsidered.” Journal of Indian Philosophy. Vol. 36, No. 3: 405-426.
Lindquist, Steven. 2011. “Literary Lives and a Literal Death: Yājñavalkya, Śākalya, and an Upaniṣadic Death Sentence.” Journal of the American Academy of Religion. Vol. 79, No. 1: 33-57.
Olivelle, Patrick. 1999. “Young Śvetaketu: A Literary Study of an Upaniṣadic Story.” Journal of the American Oriental Society. Vol. 119, No.1: 46-70.
Olivelle, Patrick. 2009. “Upaniṣads and Āraṇyakas.” Brill’s Encyclopaedia of Hinduism, edited by Knut Jacobsen. Leiden: Brill: 41-55.
Olivelle, Patrick. 2012. “Kings, Ascetics and Brahmins: the Socio-Political Context of Ancient Indian Religions.” Dynamics in the History of Religions between Asia and Europe, edited by Volkhard Krech and Marion Steinicke. Leiden: Brill: 117-136.
Patton, Laurie. 2004. “Veda and Upaniṣad.” The Hindu World, edited by Sushil Mittal and Gene Thursby. London: Routledge: 37-51.
Thapar, Romila. 1993. “Sacrifice, Surplus, and the Soul.” History of Religions. Vol. 33, No. 4: 305-324.
Witzel, Michael. 2003. “Vedas and Upaniṣads.” The Blackwell Companion to Hinduism, edited by Gavin Flood. Oxford: Blackwell Publishing: 68-98.

Author Information

Brian Black
Email: b.black@lancaster.ac.uk
Lancaster University
United Kingdom

Totalitarianism

Totalitarianism is best understood as any system of political ideas that is both thoroughly dictatorial and utopian. It is an ideal type of governing notion, and as such, it cannot be realised perfectly.

Faced with the brutal reality of paradigmatic cases like Stalin’s USSR and Nazi Germany, philosophers, political theorists and social scientists have felt not just intellectually motivated but morally compelled to explain the causes and implications of totalitarianism. This has been in part an attempt to explain the socio-political phenomenon in itself, as well as to develop an intellectual tool in the arsenal of democracy.

Diverse philosophical perspectives have been employed. They share the important common denominator of an appeal to the value of human life, critical thought, and a pluralistic society. Many of the key figures among the anti-totalitarian thinkers discussed here were European Jewish refugees who escaped totalitarian systems. Many who work on this question have been motivated by a desire to come to grips, philosophically, with what is undoubtedly the greatest intellectual justification for mass murder in history: the twentieth century totalitarian state.

Introduction
Second World War and Cold War Thought
Later Work
1. Judith Shklar’s Liberalism of Fear
2. Avishai Margalit on the Decent Society and Totalitarianism
References and Further Reading

1. Introduction

The term “totalitarianism” dates to the fascist era of the 1920s and 1930s, and it was first used and popularised by Italian fascist theorists, including Giovanni Gentile. It progressively came to be extended to include not just extreme utopian dictatorships of the far right, but also Communist regimes, especially that of the Soviet Union under Joseph Stalin. It is still frequently associated with Cold War thought of the 1940s and 1950s, a period during which it was most widely utilised as a governing concept, although its philosophical implications transcend that era’s political fears and rhetoric. As used in this article, “totalitarianism” will refer to the most extreme modern dictatorships possessing perfectionistic and utopian conceptions of humanity and society.

Totalitarianism’s appeal is linked to a variety of perennial values and intellectual commitments. Although totalitarianism is a distinctly modern problem, proto-totalitarian notions may be found in a variety of philosophical and political systems. In particular, Plato’s utopian society discussed in the Republic featured a caste-based society in which both social and moral order are to be maintained and fostered through strict political control and eugenics.

In the seventeenth century, absolutists and royalists such as Thomas Hobbes and Jacques Bossuet advocated, in various ways, a strong centralized state as a guarantor against chaos in conformity with natural law and biblical precedent. However, it was only in the early twentieth century that totalitarianism, properly understood, became a conceptual and political reality. Thinkers as diverse as Carl Schmitt in Germany and Giovanni Gentile in Italy helped to lay the foundations of fascist ideology, stressing the defensive and unifying advantages of dictatorship. In the nascent USSR, Vladimir Lenin developed Marx’s ideas from a potentially totalitarian base into a full-blown communist ideology, in which Marx’s own phrase “the dictatorship of the proletariat” was interpreted explicitly to mean the dictatorship of the Soviet Communist Party.

The term “totalitarianism” is also sometimes used to refer to movements that in one way or another manifest extreme dictatorial and fanatical methods, such as cults and forms of religious extremism, and it remains controversial in scope. It has been a topic of interdisciplinary interest, with various typologies offered by political scientists (see Friedrich and Brzezinski 1956 for the locus classicus of such approaches).

This article will primarily examine some key models and criticisms of the problem of totalitarianism defended by preeminent philosophers, as well as the thoughts of some key and representative scholars in other disciplines whose work is of philosophical significance. Their perspectival range encompasses strongly liberal, intellectual historical, neo-Marxist and pragmatist approaches. All have wished to distinguish totalitarianism sharply from liberal democratic ideals and society.

2. Second World War and Cold War Thought

a. The American Pragmatists on the Values of Pluralism and Democratic Debate

It is by no means surprising that American pragmatists should have responded to the challenge of totalitarianism in the mid-twentieth century. Not just Cold War realities, but philosophical method and values were key factors in this response. Given its strong emphasis on experimental method and the value of individual experience and fallibilism in epistemology, pragmatism would seem prima facie inimical to dictatorship.

i. John Dewey on Democratic Method

Philosophy, in order to be at its best, requires both critical thinking and democratic action, on any interpretation of Dewey’s pragmatism. In a number of works published between the 1930s and his death in 1952, John Dewey felt compelled to defend democracy against the growth and expansionism of totalitarianism, and this engagement was in keeping with Dewey’s passion for social activism and public education over the course of his long life. Dewey’s action on this matter included chairing the 1937 Dewey Commission that critically examined Soviet charges against Leon Trotsky.

Dewey had been interested in the problems of democracy for some time when he wrote his 1939 democratic credo I Believe. The rapid expansion of fascism and the Soviet Great Purge of the mid to late 1930s alerted Dewey to imminent threats to individual freedom from diverse quarters. In this short work, Dewey stated that he felt compelled to emphasize the fundamental value and importance of individuals over the state in the face of creeping totalitarianism. He here affirmed the pragmatist conviction that experience and institutions tempered by democratic problem solving ought to be primary in social philosophy. Dewey held that such problem solving, in order to be ethically compelling, must be respectful of the fundamental primacy of individual rights. It must furthermore involve an important element of negotiation and compromise over dogmatic assertion.

Furthermore, Dewey held that the rise of modern dictatorships was in part a reaction to an excessive form of individualism that isolated human beings from each other, and that offered only modern capitalism in mass society as a choice:

The negative and empty character of this individualism had consequences which produced a reaction toward an equally arbitrary and one-sided collectivism. This reaction is identical with the rise of the new form of political despotism. The decline of democracy and the rise of authoritarian states which claim they can do for individuals what the latter cannot by any possibility do for themselves are the two sides of one and the same indivisible picture.

Political collectivism is now marked in all highly industrialized countries, even when it does not reach the extreme of the totalitarian state….[the individual] is told that he must make his choice between big industry and finance and the big national political state. (Dewey, 1993: 235-236).

ii. Sidney Hook on Heresy versus Conspiracy

Sidney Hook was Dewey’s prime disciple in the application of pragmatism to anti-totalitarian thought. In his highly controversial 1953 book, Heresy, Yes—Conspiracy, No, Hook incurred the allegation of McCarthyism due to his advocacy of a firm line against the American Communist Party, especially within academia and educational trade unions.

Hook, who was social democratic for much of his career, distinguished between a genuinely progressive left that operates in a heretical and democratic manner, and the Stalinist American Communist Party and its fellow travellers. Heresy, for Hook, is an entirely legitimate expression of dissent on controversial matters. However, he held the Communist movement to be inherently conspiratorial and subversive of the very ground rules of democracy, and this led him to advocate restrictions against its carrying out policies and actions inimical to elected government. In effect, Hook affirmed the legitimacy of democracy protecting itself not just from external aggression, but from internal subversion in the interest of foreign aggressors, such as the USSR. He took this to be in keeping with the pragmatist emphasis on democratic consensus and open debate in the interest of solving social problems, a methodology diametrically opposed to Stalinism.

Hook’s core thesis of muscular liberalism is powerfully stated in a New York Times Magazine article subsequently expanded into a 1953 book:

Liberalism in the twentieth century must toughen its fibre, for it is engaged in a struggle on many fronts. Liberalism must defend the free market in ideas against the racists, the professional patrioteer, and those spokesmen of the status quo who would freeze the existing inequalities of opportunity and economic power by choking off criticism.

Liberals must also defend freedom of ideas against those agents and apologists of Communist totalitarianism, who, instead of honestly defending their heresies, resort to conspiratorial methods of anonymity and other methods of fifth columnists. (Hook, 1950: 143).

The usual objections to pragmatism are pertinent to its Deweyan anti-totalitarian strain. These revolve around the claim that pragmatism has an insufficiently robust and general conception of truth and evidence to serve as an adequate foundation for ethical and political principles. Ethical foundationalists, in particular, have rejected pragmatism as possessing excessively relativistic implications and as lacking a strong sense of moral tradition.

Contemporary pragmatists have, in different ways, attempted to respond to such criticisms by stressing the great value of democratic society in upholding value pluralism and open-ended inquiry:

…democracy is not just one form of social life among other workable forms of social life; it is the precondition for the full application of intelligence to the solution of social problems. (Putnam, 1992: 180).

Whether or not pragmatist anti-totalitarianism succeeds in its defence of democracy and individual rights is thus deeply linked to the coherence and adequacy of pragmatist defenses of a fallibilistic and at times flexible conception of truth in ethics and politics. If there is no need for traditional ethical foundationalism in upholding the value of democracy against tyranny, then the pragmatist case against totalitarianism may be seen to be a serious methodological option.

b. The British Liberal Defence of the Open Society and Pluralism

Although both Karl Popper and Isaiah Berlin were born outside of Great Britain, they were both leading theorists of anti-totalitarianism in British academia. The Israeli scholar, Jacob L. Talmon, was British trained, and is best seen as applying the British liberal tradition to the Enlightenment. There are clear affinities between their positions on this issue, which are best seen as continuations of the British liberal tradition well into the twentieth century, when it faced the challenge of the totalitarian state. The three representatives of British liberalism discussed here shared a commitment to individual liberty, wariness of state power, and an evident suspicion of what they took to be the collectivist and utopian excesses of various Continental thinkers.

i. Karl Popper’s Indictment of Historicism

In several works, Karl Popper articulated a vigorous defence of liberal democracy over dictatorship. In his early work there is a particular emphasis on the unscientific and ultimately illogical character of all forms of historical determinism and collectivism. In The Poverty of Historicism, he stressed the philosophical errors of utopianism, and what he termed “historicism”—assuming or attempting to argue for the existence of deterministic historical laws, and the possibility of deriving accurate predictions from them. These predictions are purportedly scientific or metaphysical, and for Popper, they betray an epistemic confusion between falsifiable and limited predictions based on evidence, and “oracular prophesies” masquerading as science or philosophical rationality.

In keeping with his philosophy of natural science, Popper urges us to shun certainty and dogmatism in social science and history, in favour of a piecemeal approach characterised by attention to particulars and the trial and error methods of fallibilism. Such an approach is not only conducive to precise and clear social explanations; Popper defends it as a philosophical shield against tyranny as well. For it is precisely the immodesty of overgeneralising to alleged rigid laws in history that has led even great philosophers and other thinkers to commit the error of historicism, which is a key component of totalitarian and fanatical patterns of thought.

Popper defines “historicism” as a theory of history that affirms the existence of deterministic laws from which iron-clad predictions can be derived. He thus accuses purportedly scientific theorists of history, including Karl Marx, of misinterpreting trends as inexorable laws, thereby producing unscientific and potentially irrational schemes of historical development. When coupled with grandiose or holistic schemes of social engineering, such approaches, for Popper, combine bad social science with lethal utopianism. We ought, he claims, to opt for “piecemeal engineering” employing trial and error experimentation, openness to constructive criticism, and the falsification of our programs:

[commitment to holistic or utopian social engineering] prejudices the Utopianist against certain sociological hypotheses which state limits to institutional control….problems connected with the uncertainty of the human factor must force the Utopianist, whether he likes it or not, to try to control the human factor by institutional means, and to extend his programme, so as to embrace not only the transformation of society, according to plan, but also the transformation of man. (Popper, 1960: 69-70).

Although written slightly later than The Poverty of Historicism, Popper’s The Open Society and its Enemies was published during the Second World War. It is therefore best seen as an intellectual contribution to the Allied cause against fascism, which was subsequently readily adapted to the struggle against Soviet dictatorship during the Cold War. Both works are permeated by a sense that democracy was under fire and could potentially be annihilated by its totalitarian rivals.

Here Popper broadens his critique of totalitarianism by indicting major figures of the Western philosophical tradition, notably Plato, Hegel and Marx. All three, he held, were guilty of collectivist and utopian social projects. In diverse ways, Plato’s notion of guardianship and the philosopher kings, Hegel’s glorification of the militaristic nation state, and Marx’s belief in the inevitability of class warfare and violent revolution all share a misguided common denominator: the historicist belief in holistic explanations derived from alleged laws of historical inevitability. In place of this, Popper recommended a non-dogmatic “critical rationalism,” within an open society that respects debate and a quest for truth and knowledge. This method ought to at all costs be substituted for historicist and utopian grand schemes of social science and philosophy of history that are characterised by a kind of oracular faith in their own future prophecies, dogmatism, and immunity to falsification.

Popper explained the appeal of historicism as a product of a false conception of the power of social science and historiography, combined with alienation and dissatisfaction:

Why do all these social philosophies support the revolt against civilization? And what is the secret of their popularity? Why do they attract and seduce so many intellectuals? I am inclined to think that the reason is that they give expression to a deep felt dissatisfaction with a world which does not, and cannot, live up to our moral ideals and to our dreams of perfection. The tendency of historicism (and of related views) to support the revolt against civilization may be due to the fact that historicism itself is, largely, a reaction against the strain of our civilization and its demand for personal responsibility. (Popper, 2011: xxxix).

Popper’s faith in rationalism and the open society has been criticised by Leszek Kołakowski for not taking into account democracies’ propensity towards self-destruction. Kołakowski holds that the diverse ends of open societies can come into conflict with each other, thereby vitiating attempts to combine liberal values coherently. He writes of Popper’s model:

The open society is described less as a state constitution and more as a collection of values, among which tolerance, rationality, and a lack of commitment to tradition appear at the top of the list. It is assumed, naively so I think, that this set is wholly free of contradictions, meaning that the values that it comprises support each other in all circumstances or at least do not limit each other. (Kołakowski, 1990: 164).

This criticism points to the question of value pluralism as discussed by Isaiah Berlin: how can a multiplicity of values, some of them potentially mutually exclusive, provide a coherent and adequate buffer against repressive, totalitarian state power?

ii. Isaiah Berlin on Liberty

Throughout his career, Isaiah Berlin devoted a considerable amount of attention to the question of totalitarianism. He saw it as one of the most important features of twentieth century history, and as the logical outcome of an excessive devotion to what he took to be a dangerously paternalistic conception of liberty.

In a key work on the subject (1969, reprinted and expanded in 2002), Berlin drew an important distinction between the negative and positive conceptions of liberty or freedom:

The first of these political senses of freedom or liberty…which (following much precedent) I shall call the “negative sense,” is involved in the answer to the question “What is the area within which the subject—a person or group of persons—is or should be left to do or be what he is able to do or be, without interference by other persons?” The second, which I shall call the “positive” sense, is involved in the answer to the question ‘What, or who, is the source of control or interference that can determine someone to do, or be, this rather than that?’ The two questions are clearly different, even though the answers to them may overlap. (Berlin, 2002: 169).

He thus held that the former is the foundation of the pluralistic liberalism that he wished to defend, and that the latter is a very different notion, involving obligatory self-realisation through the perfection of the individual and society in accordance with natural or historical necessity. Whereas negative liberty is a cornerstone of toleration, openness to new knowledge and individual rights, positive liberty, for Berlin, is the state’s paternalistic high road to totalitarianism.

Long associated with despotic and dictatorial regimes, positive freedom had, by the mid-twentieth century, formed part of the justification for both communist and fascist dictatorships. By claiming deterministic justifications including a truly scientific conception of historical law, social Darwinism or the will of the people, totalitarian states of both the extreme left and the extreme right justified the murder of millions in the name of a unitary and static utopian future that they saw as set and predictable.

For Berlin, this totalitarian development of positive liberty was not an aberration, but a logical conclusion. It emerged in a particularly lethal form in the twentieth century due to its central role in the justification of illiberal and non-humanistic ideologies, including communism, fascism, and the sort of extreme romantic nationalism and clericalism already present prototypically in the thought of nineteenth century figures such as Joseph de Maistre.

Against this, Berlin urged humanity to seek a decent society with pluralistic values, thus eschewing utopian perfectionism. This he thought to be characterised by a fallibilistic conception of knowledge, peaceful trade-offs, and the rejection of nihilism and relativism in favour of common values across genuinely diverse ways of life. Such a society would, he held, resolve to maintain a pluralistic balance of values against any and all attempts to sacrifice entire groups of people in the name of a future that can never be fully predicted.

A key criticism of a stark division between negative and positive liberty has been offered by Charles Taylor (1985). He claims that the terms have been used in an excessively narrow way so as not to do justice to the complexity of human freedom. In particular, the existence of what he has termed “strong evaluations” (Taylor, 1985: 220), that is, important qualitative distinctions in the ranking of individuals’ desires and projects, would seem to render incomplete any use of the idea of negative freedom as essentially a lack of coercion or obstacle. For Taylor, this conception of negative liberty stems from diverse and likely parallel sources in the Western philosophical tradition, such as Hobbes and Bentham. He claims that in order to do justice to freedom, even sophisticated liberals such as Mill have made significant use of concepts of self-development and improvement, and this implies some degree of positive liberty. So positive liberty is best understood as a part of individual freedom and flourishing, and not necessarily a component of totalitarianism.

The extent to which the state should promote positive liberty remains an important question. Understood along the lines indicated by Taylor, it may be a value to be realized through self-development in a more democratic society. This is in keeping with the claims not only of Taylor, but of other thinkers as well.

iii. Jacob Talmon on Totalitarian Democracy

In 1952, Jacob L. Talmon published a liberal indictment of those views of eighteenth century thought that saw the French Enlightenment as manifesting overwhelmingly liberal tendencies.

Talmon argued, in The Origins of Totalitarian Democracy, that both liberal-empirical and totalitarian tendencies were significant and influential in European thought by the time of the French Revolution. In particular, he held that key aspects of the thought of Jean-Jacques Rousseau and of lesser known radical Babouvist egalitarian Enlightenment figures, such as Gabriel Bonnot de Mably and Étienne-Gabriel Morelly, are best seen as a foreshadowing of twentieth century totalitarianism.

Like Berlin, Talmon stresses the fundamental divergence between individualist and collectivist or statist conceptions of freedom. He divided early modern democratic thought into two broad categories: “liberal” and “totalitarian” democracy. The former led, through a long process of parliamentary development across the nineteenth century, to the institutions regarded as democratic in the mid-twentieth century. The liberal democratic thought of Benjamin Constant and Alexis de Tocqueville in France, as well as John Stuart Mill in England, was instrumental in developing this political tradition to a philosophical apogee. Talmon traced its origins in part to John Locke’s defense of individual property rights. Totalitarian democracy, on the other hand, developed largely from radical French Enlightenment thought through Babeuf and the Jacobin stream of the French Revolution, and through nineteenth and early twentieth century Marxism. Talmon describes it as a form of “political Messianism.”

Liberal democracy has stressed the importance of individual human rights, empiricism and the rule of law from its beginnings. It advocates piecemeal reform and the application of rationality to arrive at optimal political remedies to social problems. Totalitarian democracy from Robespierre and the Jacobins through Karl Marx and into the twentieth century has been utopian, collectivist and statist. Talmon furthermore holds it to be characterised by historical determinism and a notion of a single comprehensible truth in political life.

The two intellectual tendencies both claim to promote freedom to the highest degree, but differ greatly in their conceptions of legitimate freedom. Both schools affirm the supreme value of liberty, but whereas the one finds the essence of freedom in spontaneity and the absence of coercion, the other believes it to be realized only in the pursuit and attainment of an absolute collective purpose. Liberal democrats believe that, in the absence of coercion, men and society may one day reach through a process of trial and error a state of ideal harmony. In the case of totalitarian democracy, this state is precisely defined, and is treated as a matter of immediate urgency, a challenge for direct action, an imminent event:

[Human beings,] in so far as they are at variance with the absolute ideal they can be ignored, coerced or intimidated into conforming, without any real violation of the democratic principle being involved. (Talmon, 1986: 2-3).

Talmon devotes considerable attention to what he takes to be Rousseau’s totalitarian tendencies in The Social Contract. Talmon finds especially collectivist Rousseau’s notion of the “general will” being over and above society and representing the highest aspirations of humanity. Furthermore, the idea that the individual can only find true liberation through the state and its supreme “Legislator” is the high road to dictatorship, for Talmon. Rousseau is thus seen as a merciless collectivist, willing to “force people to be free” in order to create a new and perfected type of human being. This ideal involves a notion of democracy as the constant and unanimous participation of the citizens of an ideal state in the acting out of the general will, thereby realising true democratic citizenship.

Talmon’s conception of the origins of totalitarianism in the French Enlightenment and its revolutionary heritage has been challenged on various grounds. The Canadian scholar C.B. Macpherson, influenced by Marxism, argued that Talmon erred in stressing ideas over class and social realities, and in thus making too strong a causal claim in linking notions of natural order and political unanimity to inevitable totalitarianism. Furthermore, he claimed that the Jacobins instituted a type of early totalitarian rule largely in response to the social pressures of revolutionary power and foreign counter-revolutionary invasion.

In effect, the true causes of historical change are thus seen as grounded in class and general social trends, and not merely in philosophical or ideological causes. This criticism holds that understanding key ideas and movements requires an understanding of their class background:

A petit-bourgeois movement like Jacobinism, or a proletarian movement still based on the same individualist assumptions (like Babouvism) is particularly liable to demand a completely general unanimity at a time when it is least possible. It might be argued that it was the petit-bourgeois character of these ideologies, rather than the assumption of a natural order, that led so readily to totalitarian dictatorship. (Macpherson, 1952: 57).

This criticism of Talmon’s core thesis bears affinities with a critique of Arendt, and it raises the general question of the social causation of ideas in an interesting way. To what extent are philosophical ideas responsible solely, or at least primarily, for mass movements throughout history, including totalitarianism? If Talmon and Arendt are right, such ideas certainly possess sufficient causal potency to be determining factors in social and political development. If their critics hold the high ground, Talmon and Arendt have inflated the importance of secondary or even epiphenomenal notions and properties to an unrealistic station.

c. Hannah Arendt on the Origins and Implications of Totalitarianism

In her seminal 1951 book, Hannah Arendt attempted to show how totalitarianism emerged as a distinctly modern utopian problem in the twentieth century, growing out of a lethal combination of imperialism, anti-Semitism and extreme statist bureaucracies. As much a work of intellectual history as political philosophy, The Origins of Totalitarianism jarred many due to its indictment of European civilization during a period of post-war reconstruction. Arendt held that totalitarianism was not a reactionary aberration, an attempt to turn back the clock to earlier tyrannies, but rather a revolutionary form of radical evil explicable by particularly destructive tendencies in modern mass politics. The atomisation of lonely individuals and the receptivity of mass society to propaganda in the modern age make totalitarianism an ongoing temptation, to be resisted through critical thinking and the affirmation of fundamental human values.

Tracing what she took to be the prime causes of totalitarianism to the nineteenth century, Arendt focused on the rise of imperialism and political anti-Semitism, and the concomitant decline of both the remnants of the feudal order and the nation state. Imperialism and anti-Semitism both drew from racist and Social Darwinist wellsprings in their repudiation of unity through language, culture, and universal rights in favour of biologically fixed and hierarchical distinctions within humanity and a struggle for world conquest. The consequent de-humanisation of entire races and ethnic groups in favour of Aryanist ideals set the grounds for fascism, with the enthusiastic support of what Arendt termed “the mob,” that is, the resentful European déclassés. Furthermore, the narrow chauvinism of pan-Slavism coupled with notions of class warfare and annihilation paved the way for a parallel communist regime of terror in the Soviet Union.

Arendt held that in both its fascist and communist varieties, the totalitarian system’s terror is not incidental, but essential. Unlike authoritarian dictatorships that strive to uphold conservative values, such regimes by their very nature aim to destroy civil society and tradition in favour of a utopian re-fashioning of humanity to suit their collectivist ideological purposes. The twentieth century totalitarian state thus emerges as a juggernaut of terror, a terror maintained in no small part by the eradication of fundamental human values and all critical thought in favour of ideology and propaganda. It thereby seeks to destroy all communal and civil institutions between it and its atomised and lonely citizens. Arendt wrote:

The ideal subject of totalitarian rule is not the convinced Nazi or the convinced Communist, but people for whom the distinction between fact and fiction (that is, the reality of experience) and the distinction between true and false (that is, the standards of thought) no longer exist. (Arendt, 1968: 474).

A key challenge to Arendt’s analysis is shared with all such work on the frontier between political theory and intellectual history, namely its degree of empirical truthfulness and the precise accuracy of its causal explanations (Gleason, 1995). Establishing such causal connections requires the extensive use of detailed historical evidence, as well as the colligation of coexisting ideas upon which Arendt relied. So, the account is subject to the usual historiographical and logical criticisms concerning the possible gap between the causation of events and the correlation of trends.

For all of the considerable attention that The Origins of Totalitarianism attracted in 1951, it was in 1963 that Arendt was to produce one of the most controversial works ever written by a political philosopher. Eichmann in Jerusalem: A Report on the Banality of Evil did not merely generate much discussion; it produced an intellectual shock wave heard around the world that still reverberates:

Hannah Arendt’s Eichmann in Jerusalem was published fifty years ago….It’s hard to think of another work capable of setting off ferocious polemics a half century after its publication. (Lilla, 2013).

Arendt here developed and expanded her general conclusions on the Holocaust and fascist bureaucracy from a series of articles that she wrote on the Eichmann trial for The New Yorker.

She claimed that for all of his extreme evil, Eichmann was not a mysterious monster, either in his overall demeanour or in his political and moral psychology. His evil was as much a matter of consequences as of intent, and in fact his intentions emerged as mixed during the trial before the Israeli court. Arendt did not claim, in her thesis of “the banality of evil,” that Eichmann was entirely neutral in his managing of the Nazis’ Final Solution, as has been maintained. Rather, she saw him as a distinctly modern product of a totalitarian bureaucracy who at times was eager to implement Hitler’s genocide, but who also showed real tendencies towards narrow instrumental rationality, clichéd thought and speaking patterns, and superficial amorality. She was thus struck by his at times entirely average bearing and thought patterns throughout the trial, for all of the enormous evil that he perpetrated.

Furthermore, Arendt claimed that the Eichmann case confirmed her view that totalitarianism represents a gross perversion of fundamental civilised and ethical values in favour of mass bureaucracy, propaganda and thoughtlessness. Both perpetrators and victims of the Holocaust were thus corrupted through a process involving the malevolence and instrumental efficiency of the Nazis, as well as the activities of a collaborating minority in the ghetto police and Jewish Councils. This last point was to provoke particular discomfort and sheer hostility, giving Arendt a virtual pariah status, although later Holocaust historiography has placed the general problem of collaboration in a more balanced context.

For Arendt, Eichmann was as much a product of the worst possible tendencies of state bureaucracy as a creator of them. This bureaucratic context in no way exonerated him, as she was careful to indicate; she held his execution in 1962 to be justified, even though she thought that there was a strong case in international law for an international tribunal for the case, rather than the Israeli court. However, the bureaucratic framework of Eichmann’s crimes required a re-examination of what she held to be a misleading diabolical conception of evil. That there is a tension between this account and the notion of radical evil developed in The Origins of Totalitarianism seems clear. However, both works share the important common denominator of an indictment of totalitarian bureaucracies that render the unthinkable not just possible, but probable and even banal. In a very real sense, this is a more disturbing thesis than Arendt’s earlier conception of evil as radical or in no small part beyond rational explanation. If Arendt was right overall, totalitarianism is a constant threat in modern mass societies, and no complacency on the matter can be justified.

d. Erich Fromm on Escaping from Freedom: A Psychoanalytical Approach

Among the various attempts to apply psychoanalysis to the question of totalitarianism, Erich Fromm’s Escape from Freedom is conspicuous for its sustained argumentation and conceptual scope. Fromm’s thesis that there exists an “authoritarian character” was subsequently developed through empirical case studies by Theodor W. Adorno and his co-authors in their work, The Authoritarian Personality.

Fromm was a philosophically inclined sociologist, who drew from both the Freudian and Marxist traditions in elaborating an explanation of diverse social phenomena. This is apparent in his view that there exists what might be termed a self-reinforcing causal mechanism between social processes and ideology, in which psycho-social factors are reinforced by belief systems, and vice versa.

Central to Fromm’s analysis is the notion that totalitarianism stems from several root causes linked to the full emergence of modern individualism in the aftermath of the Reformation. Medieval social psychology was strongly transcendental in its emphasis on the secondary character of secular authority under God, and thus it inhibited the development of the sense of loneliness and isolation that characterised Western history from about the sixteenth century onwards.

For Fromm, Protestantism stimulated the development of individualism in its stress on individual success and good works, dutiful submission to God, thrift, and a significant sphere for secular authority. A self-reinforcing causal mechanism became increasingly apparent, especially among the middle classes of modern capitalist society, as the new form of Christianity helped to create the modern individual, and was in turn strengthened by the resultant socio-economic psychology of modern European society.

However, there can be no turning back the clock according to Fromm. Rather, modern humanity must strive to encourage healthy life-affirming values and the expression of human freedom. This is best done by recognising, as a society, the values of love, spontaneity, and secure personal development.

Fromm proposes that the anxiety in isolated individuals, produced by the great burden to succeed demonstrably and without secured grace in the eyes of God, led to severe social and psycho-pathologies. In particular, collectivist ideologies, including totalitarianism, emerged to satisfy the modern individual’s need for a sense of a higher purpose or calling:

It seems that nothing is more difficult for the average man to bear than the feeling of not being identified with a larger group….The fear of isolation and the relative weakness of moral principles help any party to win the loyalty of a large sector of the population once that party has captured the power of the state. (Fromm, 1969: 234).

A chilling picture thus emerges of an inherently alienated and insecure modern society that generates mass social movements of conformity. For Fromm, this was especially true of the German lower middle class, which he held to be strongly influenced by modern individualistic ideologies. He furthermore held that this class was the most alienated class in Germany, and thereby prone to a compensatory destructiveness, and that it was strongly characterised in Weimar Germany by a sense of having lost its legitimate class status. Thus the rise of Nazism had both important psycho-social and class factors, in his view.

Fromm analyses various “mechanisms of escape” by which the alienated seek relief from the burden of individual autonomy. Prime strategies, linked to totalitarianism’s appeal, are unthinking submission to the leader, and mindless conformity. The latter trend he saw not just in totalitarian society, but in capitalist democracies as well, and he held that countering it requires concerted social activism.

Both sadism and masochism are seen by Fromm as attempts to overcome feelings of individual powerlessness and meaninglessness. In politics, the authoritarian character is characterized by a slavish and nihilistic submission to authority, and a desire to have it over others. This character type, for Fromm, is the one most easily seduced by fascism.

If Fromm was correct in this, then the root causes of totalitarianism are both internal, or psychological, and external, in the form of trends in class relations and ideological evolution. The threat therefore remains latent even in seemingly highly democratic modern societies, although Fromm did not advocate a relativism that would blur the lines between imperfect democracies and dictatorships.

Fromm, like Taylor, holds that positive notions of freedom can be of constructive value in counteracting political and social distortions and pathologies. In particular, a social democratic society that provides the individual with adequate resources and a sense of autonomous personal development can do much, he held, to reduce the appeal of totalitarian ideologies and to promote mental health and social ethics:

We must replace manipulation of men by active and intelligent cooperation, and expand the principle of government of the people, by the people, for the people to the economic sphere. (Fromm, 1969: 300).

Fromm’s analysis focussed considerably more on fascism than on communism. Its political diagnosis of Nazism, in particular, has been faulted even by sympathetic critics on several counts:

Fromm did not…treat the intensity of Hitler’s anti-Semitism, choosing instead to locate the Jew with the communist and the Frenchman as examples of Hitler’s purportedly “lesser” groups. Nor did Fromm point to the discredited Social Darwinist premises behind the Nazi quest for Aryan purity…. His hypothesis about the lower middle class has not held up. The Nazis gained votes from all classes. (Friedman, 2013: 113).

In his later work, Fromm extended his classic work on human aggression and destructiveness, providing psycho-biographies of totalitarian leaders such as Hitler, Himmler, and Stalin.

3. Later Work

a. Judith Shklar’s Liberalism of Fear

Throughout her work, the American political theorist Judith Shklar stressed the importance of seeing liberalism not as a utopian or perfectionistic ideal, but rather as a bulwark against tyranny and cruelty. In effect, she claimed that liberalism ought to be defined more by its opposition to oppression and nastiness than by anything else.

Shklar traces the roots of liberalism to the struggle for religious toleration in Reformation and Baroque Europe. In her model, a progressive consensus emerged in Western thought, holding that cruelty is supremely wicked. Early figures in this development include Montaigne and Montesquieu, whom Shklar contrasted with Machiavelli on this question.

This commitment to “put cruelty first” contributed greatly to the development of liberalism’s abhorrence of dictatorships of all kinds, including those of a modern totalitarian character. This implies an affirmation of memory over hope, and of sensitivity to the horrors of oppression over utopian aspiration. Not merely property rights, cultural pluralism, and the rule of law, but anti-tyranny first and foremost define the modern liberal perspective. If liberalism is rare historically and globally, this has more to do with the widespread character of cruel delusion than with any intrinsic defect on its part. For Shklar, we ought to remember at all costs the disastrous consequences of not putting cruelty first:

We must…be suspicious of ideologies of solidarity, precisely because they are so attractive to those who find liberalism emotionally unsatisfying, and who have gone on in our century to create oppressive and cruel regimes of unparalleled horror. (Shklar, 1998: 18).

Shklar’s negative liberalism has been criticised by Michael Walzer as setting reasonable anti-totalitarian boundaries for democratic action, while not recognising the importance of moving beyond them in the interest of social progress:

We always have to be afraid of political power; that is the central liberal insight. But this is an insight into a central experience that wasn’t discovered, only theorised by liberal writers. Nor does this fear by itself make for an adequate theory of political power. We must address the uses of power as well as its dangers. And since it has many uses, we must choose among them, designing policies, like Shklar’s guaranteed employment, that enhance and strengthen what we most value in our own way of life. Then we try to enforce those policies, carefully if we are wise, remembering the last time we were fearful, and acting within the limits of liberal negativity. (Walzer, 1996: 24).

If this is correct, then the strong anti-totalitarianism of the liberalism of fear should be seen as setting boundaries against tyranny, rather than final limits to progressive social policy. Positive liberty is thereby affirmed, within strong democratic boundaries.

b. Avishai Margalit on the Decent Society and Totalitarianism

In reaction to the strong emphasis upon theories of justice in late twentieth century political thought, Avishai Margalit presented his case for the “decent society.” Such a society is, first and foremost, one that does not humiliate people. This means not treating human beings as less than human, as mere machines, animals, or inanimate objects. For Margalit, even if a society is just institutionally and procedurally, it may nonetheless denigrate its citizens and subjects in diverse institutional ways, thereby rendering it formally civilized but indecent. Without denying the value of social justice and the rule of law, Margalit has claimed that philosophy and political theory long neglected decency, which is every bit as important as justice. In so doing, they could not do justice to one of the main forms of oppression: institutional and state contempt for individuals.

In The Decent Society, Margalit contrasts totalitarian and gossip societies. Both these types of society are, for Margalit, indecent in not respecting individuals and their own legitimate social space. Gossip societies allow for a considerable range of imperfection, but lack decency in their absence of respect for privacy, and their non-institutional or cultural humiliation of alleged non-conformists.

In their radical perfectionism, totalitarian societies have no respect for individual privacy, and they systematically and institutionally obliterate communal and family structure between the individual and the state. Such societies’ regimes do everything within their considerable power to humiliate their subjects so as ultimately to perfect them, by recognising no legitimate private space, and by gathering sensitive information with which to blackmail and control them. They are thus agents of ultimate indecency, for Margalit.

Friendship among anti-totalitarian dissidents is thus especially valuable and intense, because of the potentially life and death solidarity that is generated by opposition to supreme state and bureaucratic indecency. The violation of such friendships by forcing dissidents to reveal sensitive information about others to the state is, for Margalit, one of the worst aspects of totalitarianism:

Totalitarian societies have proved to be a prescription for and guarantor of brave friendship, since friendships in regimes of this sort are conspiracies of humanity against the inhumanity of the regime. (Margalit, 1996: 210).

Margalit’s analysis provoked some re-examination of political and social philosophy’s focus on justice. In particular, the general core question of the balance to be struck between decency and justice raises fundamental questions about value priority:

…one might take the view that the best way for a society to strive to become decent is by promoting justice. By treating people in accordance with justice, society denies them one sound reason to feel rejected from humanity, however much they may actually feel that way….the decent and the just society may be too closely intertwined for us to be able to say that one or other has clear priority as an ideal. (Patten, 2001: 231).

Patten’s question reminds us of the extent to which justice and rights have a fundamental role in social and political values. It should be clear that Margalit in no way wishes to deny the value of justice. However, one may recall here Arendt’s thesis to the effect that totalitarianism arose in part not only because of an indecent Social Darwinism, but also because of the repudiation of universal human rights. This may well be a strong challenge to attempts to reduce the firm priority of justice in political life.

4. References and Further Reading

Adorno, Theodor W., Frenkel-Brunswik, Else, Levinson, Daniel J., and Sanford, R. Nevitt. The Authoritarian Personality. Harper & Brothers, New York, 1950.
Arendt, Hannah. Eichmann in Jerusalem: A Report on the Banality of Evil. Penguin, New York, 2006.
Arendt, Hannah. The Origins of Totalitarianism. Harvest Books, Harcourt Brace Jovanovich, San Diego, New York and London, 1968.
Arendt, Hannah. Between Past and Future: Six Exercises in Political Thought. Meridian Books, the World Publishing Company, Cleveland and New York, 1963. (Contains “What is Authority?”)
Berlin, Isaiah. The Crooked Timber of Humanity: Chapters in the History of Ideas. Edited by Henry Hardy. Pimlico, London, 2003a. (Contains “The Pursuit of the Ideal,” and “The Decline of Utopian Ideals in the West.”)
Berlin, Isaiah. Freedom and its Betrayal: Six Enemies of Human Liberty. Hardy, Henry (editor). Pimlico, London, 2003b.
Berlin, Isaiah. Liberty. Hardy, Henry (editor). Oxford University Press, Oxford and New York, 2002. (Contains several key essays in the section “Five Essays on Liberty.”)
Berlin, Isaiah. The Sense of Reality. Hardy, Henry (editor). Chatto and Windus, London, 1996. (Contains important essays such as “The Sense of Reality,” “Political Judgement,” and “Philosophy and Government Repression.”)
Bossuet, Jacques. Politics Drawn from the Very Words of Holy Scripture. Cambridge University Press, Cambridge, 1999.
Cotter, Matthew J. (editor). Sidney Hook Reconsidered. Prometheus Books, Amherst, NY, 2004.
Dewey, John. The Political Writings. Morris, Deborah and Shapiro, Ian (editors). Hackett, Indianapolis and Cambridge, 1993. (Contains “I Believe”)
Friedman, Lawrence J. The Lives of Erich Fromm: Love’s Prophet. Columbia University Press, New York, 2013.
Friedrich, Carl J. and Brzezinski, Zbigniew K. Totalitarian Dictatorship and Autocracy. Harvard University Press, Cambridge, 1956.
Fromm, Erich. The Anatomy of Human Destructiveness. Holt, Rinehart and Winston, New York, 1973.
Fromm, Erich. Escape from Freedom (Fear of Freedom in the UK). Avon Books, New York, 1969.
Fromm, Erich. Man for Himself. Fawcett Premier, Greenwich, Connecticut, 1967.
Gentile, Giovanni. Origins and Doctrine of Fascism with Selections from Other Works. Edited by A. James Gregor. Transaction Publishers, Edison, NJ, 2003.
Gleason, Abbott. Totalitarianism: The Inner History of the Cold War. Oxford University Press, Oxford and New York, 1995.
Hobbes, Thomas. Leviathan. Wordsworth Editions, Hertfordshire, 2014.
Hoffmann, Stanley (editor). Political Thought and Political Thinkers. University of Chicago Press, Chicago and London, 1998.
Hook, Sidney. Heresy, Yes—Conspiracy, No. Greenwood Press, Publishers, Westport, Connecticut, 1953.
Hook, Sidney. “Heresy, Yes—But Conspiracy, No.” New York Times Magazine, July 9, 1950. Available online through Dissent Archives.
Kołakowski, Leszek. Modernity on Endless Trial. University of Chicago Press, Chicago, 1990.
Konvitz, Milton R. and Kennedy, Gail (editors). The American Pragmatists. Meridian Books, UK, 1960.
Lenin, Vladimir. The State and Revolution. Penguin Books, Middlesex, 2009.
Lilla, Mark. “Arendt and Eichmann: The New Truth.” New York Review of Books, November 21, 2013. Online version.
Litwack, Eric B. “Erratum to: Epistemic Arguments against Dictatorships.” Human Affairs 21, 2011, pp. 226-235.
Macpherson, C. B. “Review of The Origins of Totalitarian Democracy.” Past and Present, Number 2, November 1952, pp. 55-57.
Patten, Alan. “Review of The Decent Society.” Mind, Volume 110, Number 437, January 2001, pp. 229-232.
Plato. Republic. Oxford University Press, Oxford, 1993.
Popper, Karl. The Open Society and its Enemies. Routledge Classics, Oxford, 2011.
Popper, Karl. The Poverty of Historicism. Routledge, London, 1960.
Putnam, Hilary. Renewing Philosophy. Harvard University Press, Cambridge and London, 1992.
Schmitt, Carl. Dictatorship. Polity Press, Cambridge, 2013.
Schmitt, Carl. The Concept of the Political. University of Chicago Press, Chicago, 2007.
Shklar, Judith N. “The Liberalism of Fear,” in Political Thought and Political Thinkers. Hoffmann, Stanley (editor). University of Chicago Press, Chicago and London, 1998.
Shklar, Judith N. “Putting Cruelty First.” Daedalus Volume 111, Number 3, Summer, 1982, pp. 17-27.
Talisse, Robert B. “Politics without Dogmas: Hook’s Basic Ideals.” In Cotter 2004, pp. 117-128.
Talmon, J. L. The Origins of Totalitarian Democracy. Penguin Books, Harmondsworth, UK and New York, 1986.
Taylor, Charles. Philosophy and the Human Sciences. Philosophical Papers 2. Cambridge University Press, Cambridge and New York, 1985. (Contains “What’s Wrong with Negative Liberty?”)
Walzer, Michael. “On Negative Politics.” Yack, Bernard (editor) (1996), pp. 17-24.
Westbrook, Robert B. Democratic Hope: Pragmatism and the Politics of Truth. Cornell University Press, Ithaca and London, 2005.
Yack, Bernard (editor). Liberalism without Illusions: Essays on Liberal Theory and the Political Vision of Judith N. Shklar. University of Chicago Press, Chicago and London, 1996.

Author Information

Eric B. Litwack
Email: e_litwack@bisc.queensu.ac.uk
Queen’s University and Syracuse University in London
United Kingdom

Thomas Reid: Philosophy of Mind

This article focuses on the philosophy of mind of Thomas Reid (1710-1796), as presented in An Inquiry into the Human Mind on the Principles of Common Sense (1764) and Essays on the Intellectual Powers of Man (1785). Reid’s action theory and his views on what makes humans morally worthy agents, although connected to philosophy of mind, are not explored here.

Reid is best known as the father of common sense philosophy. He contends that going back to the principles of common sense will help deal with the problems engendered by the so-called “skeptical views” of his predecessors: Descartes, Locke, Berkeley, and Hume. He argues that “the way of ideas” generates undue uncertainty in the theory of knowledge. If the only things that can be known directly and immediately are the contents of one’s mind, there can be no certainty in the knowledge geared toward the external world. Reid believes this goes against the common-sense view that humans do acquire certain knowledge through empirical observation of the external world, and are therefore not confined to know only the contents of their minds.

In philosophy of mind, Reid is most celebrated today for the arguments he gave in support of the position known as direct realism, which, at its most basic, states that the primary objects of sense perception are physical objects, not ideas in human minds. However, Reid’s philosophy of mind neither begins nor ends with perception. In addition to arguing for direct realism and, consequently, against “the way of ideas,” he undertook the task of establishing the equal status of the faculties of the mind, and of explaining the relationships that exist among them. He is a worthy successor of Locke, in that he believes that the mind is to be characterized in terms of a faculty psychology. He is a worthy successor of Newton, in that he believes that the scientific method is the right way of investigating the nature of mind. Reid characterized the scientific method mainly by trial and error, and by setting up experiments and drawing general conclusions from them.

One of the starting points of Reid’s philosophy of mind is a traditional distinction between the “powers of the understanding” and the “powers of the will.” Reid believes this distinction is not entirely correct because the mind is active whenever the powers of the understanding are exercised, and a certain degree of understanding is needed for any act of will. However, he uses it to classify the faculties of the mind into intellectual, on the one hand, and active, on the other. The distinction is used in the titles of his two mature published works: Essays on the Intellectual Powers of Man (1785) and Essays on the Active Powers of Man (1788), which he envisioned as two sides of the same coin. Reid thought that any theory of the mind should comprise an investigation into both types of mental operations.

1. Sensation
2. Perception
  a. Original Perception
  b. Acquired Perception
3. Memory
  a. General Considerations on Memory
  b. Memory and Personal Identity
4. Intellectual Powers (Proper)
  a. Conception
    i. Bare Conception
    ii. Imagination
  b. Judgment and Reasoning
5. Taste
  a. Why This Faculty Is Called “Internal Taste”
  b. An Objectivist Account of Beauty
6. References and Further Reading
  a. Primary Sources
  b. Secondary Sources

1. Sensation

Reid argues that sensation is an original and simple operation of the mind, which for him means not only that certain beings (namely sentient ones) are born with an ability to sense, but also that this operation of the mind cannot be logically defined. All natural operations of the mind are simple and, in some sense, primitive, so that no reductive definition can be offered. This does not mean, however, that one cannot pay attention to the specific role played by this operation. In doing so, one will discover its most important features.

Although careful introspective observation will reveal that sensations do not usually occur on their own, but are almost always accompanied by perceptions, Reid points out that a clear-cut distinction between sensation and perception exists and should be accounted for. This distinction has to do primarily with the specific roles sensations and perceptions play in the knowledge of the external world. Sensations are of limited use in this respect; they give information only about what goes on in the sentient being. Perceptions, on the other hand, contribute to the basic repository of knowledge. In sensing a smell or tasting a taste, for instance, a sentient being will take notice of how its mind is affected, but, as Reid points out, such sensations bear no resemblance to any of the qualities of the external objects that cause these sensations to occur in the sentient being. Here Reid differs from his predecessors: according to John Locke, for instance, at least some sensations (those derived from the primary qualities of objects) do resemble the external objects which occasion the formation of such simple ideas in sentient beings such as humans (Locke, Essay II. viii. 15). To make the distinction with perception more vivid, Reid discusses an example: in seeing a flower or touching a sugar cube (which involves perceiving and having contentful thoughts about these objects, as is elaborated in the next section), humans gain knowledge about what these external objects really are. There still is no resemblance thesis advanced, to be sure; the mind is simply projected outside itself and, in doing so, it objectifies the things in its environment. In this, Reid is very forward-thinking: he is the first philosopher to draw a distinction between sensation and perception, which is extensively employed in contemporary philosophy of mind and psychology (as J. J. Gibson rightly noted).

This distinction between sensation and perception rests primarily on a peculiarity of the faculty of sensation: Reid believes that this is the only operation of the mind that “hath no object distinct from the act itself” (EIP I. 1, 36). He acknowledges the fact that human language is misleading in this respect: for instance, for both sensation and perception, people use “the same mode of expression” (IHM 6.20, 167). This mode of expression involves an active verb and an object: one can say both that “I feel a pain” and that “I see a tree” (IHM 6.20, 167). But, Reid contends, in the former case the object itself is grammatical only, and not also real, whereas in the latter the object is a real thing, allegedly existing outside the perceiver’s mind.

It is less clear what Reid means when he says that the object is not real, but grammatical only, in the case of the construction expressing a sensation that one may feel. There are two ways of interpreting this claim, and this ambiguity tracks two distinct positions in the secondary literature on Reid. On the one hand, sensations, for Reid, can be understood to not have objects at all: as such, this mental operation is distinct from all others. If we understand sensation to have no object, to be about nothing, it cannot ever be wrong. This would mark sensation as a very special faculty among the faculties of the human mind; perception or memory are not like this: someone can misperceive a tree just as well as he can misremember having seen a tree. But a person can never be mistaken about a feeling that particular person has: whenever someone has a headache, that ache is real and it is that person’s and it is exactly as that person is feeling it. On the other hand, that passage has been read as saying that sensations take themselves as objects; Reid, in this interpretation, would subscribe to a reflexive view of sensations. Just like perceptions and memories, sensations are constituted by two other ingredients: a conception of the object, and a belief that the object exists, except, in the case of sensation, this object is the sensation itself, not an external object like trees, frogs, or human beings.

A consequence of understanding Reid as saying that sensations do not have any kind of objects is to think that he is a precursor of “adverbial” theories of sensation. In this account, a sentient being is not said to have a sensation of a red object, but to sense in a certain way whenever stimulated in the right manner. Sensations inform the sentient being of various ways of feeling: there is a particular way of feeling redly, as opposed to a particular way of feeling yellowly, and there is yet another way of feeling headachely (see also Sense-Data). Understanding that sensations provide us with a qualitative feel and making sense of what exactly this means has become very important in early 21st-century discussions on the nature of mind and consciousness. According to some authors, such as David Chalmers, Frank Jackson, Joseph Levine, and Thomas Nagel, qualia offer sufficient proof that a complete reduction of all mental processes to purely physical processes (as described by a physicalist interpretation of brain processes) is impossible (for more, see Qualia). So, understanding Reid’s position in this manner will place him squarely within one of the most important debates in contemporary philosophy of mind.

The last attribute of sensations worth mentioning is their role as signs of external objects. Usually, sensations pass unnoticed (unless the sentient being carefully attends to them), and the mind moves at once to the things that they signify. This feature of sensations allows Reid to argue that they are never to be identified with Lockean ideas (Locke, Essay II. viii. 8): they are not the objects of perception, and, moreover, they are not mental intermediaries between the mind and the world. Perception of external objects turns out to be immediate, in Reid’s view (Reid on sensations as signs: IHM 2. 10, 43; IHM 4. 1, 49; IHM 6. 21, 177). To properly understand the role of sensations as signs of external objects, according to Reid, an analysis of perception should be given, a task undertaken in the next section.

2. Perception

Perception is the main faculty whose role is to give beings endowed with it brute knowledge about the external world. The knowledge is brute because no reasoning enters perception; and the result counts as knowledge even though sometimes, when the perceiver believes that something is being perceived, the perceiver is in fact subject to a perceptual illusion or a hallucination. However, even when a perceptual state results in a false outcome, the state itself should be characterized as perception (for more on how and why perception can be non-veridical, see EIP II. 22, 241–252). How, then, do sensations, as signs of external things, work to connect minds with those things? Reid argues that:

[A] requisite to our knowing things by signs is, that the appearance of the sign to the mind, be followed by the conception and belief of the thing signified. Without this the sign is not understood or interpreted; and therefore is no sign to us. […] Now, there are three ways in which the mind passes from the appearance of a natural sign to the conception and belief of the thing signified; by original principles of our constitution, by custom, and by reasoning. (IHM 6. 21, 177)

This passage is important in several respects: (i) it gives Reid’s “official” characterization of perception, and (ii) it lays the foundation for an important distinction at the level of perception. These two aspects are discussed in turn.

First, Reid argues that “the appearance of the sign” is followed by a conception and belief of the thing signified. When Reid gives his official characterization of perception he states that this faculty involves several others: the occurrence of a sensation suggests a conception and a belief of the existence of the thing perceived. Moreover, this existential belief is immediate, and not the product of reasoning (EIP II. 6, 96). If it were the product of reasoning, “the greatest part of men would be destitute of [the information had of external objects]; for the greater part of men hardly ever learn to reason; and in infancy and childhood no man can reason” (EIP II. 6, 101). Perception, therefore, must be able to occur independently from any act of reasoning.

The second feature of perception that the passage quoted above refers to is the distinction Reid draws between original perception and acquired perception: in the case of original perception, a natural sign (that is, a sensation) suggests a conception and a belief “by original principles of our constitution.” In the case of acquired perception, by contrast, the natural sign in question suggests a conception and a belief “by custom,” which most probably means “habit” and/or “experience.” Let us take a closer look at this distinction by pinning down some of the essential features of original perception, and by emphasizing some of the points of departure from this model in the case of acquired perception.

a. Original Perception

According to Reid (IHM 6. 20, 171 and EIP II. 21), only two of the senses give beings endowed with them original perceptions, namely those of touch and sight. The sense of sight is somewhat problematic in this respect, though, since vision does not provide creatures endowed with it with original visual perceptions of some things, for instance depth, but only with acquired ones. In original tactile perception, the sensation had of the so-called “primary qualities of bodies” immediately suggests a conception and belief of the existence of these qualities, and of substances in which such qualities inhere. In original visual perception, the sensations of colors suggest conceptions and beliefs of the existence of the so-called “secondary quality” of color as existing outside of minds, in an external object. The perception of visible figure is also supposed to be original, according to Reid and, according to the standard interpretation of Reid, it is not accompanied by any type of visual sensation whatsoever. Why does Reid think that only two of the senses—touch and vision—can give beings that have them original perceptions? Why cannot smell, taste, and hearing provide such beings with original perceptions? Can this have anything to do with the distinction between primary and secondary qualities of objects? This is a good place to offer some details on Reid’s view of the distinction between primary and secondary qualities of objects. As previously mentioned, Reid thinks that Locke was wrong to believe that there is some resemblance between primary qualities of objects and the ideas or sensations sentient beings have of them. However, Reid himself draws a distinction between these two types of properties of objects:

There appears to be a real foundation for the distinction, and it is this: That our senses give us a direct and distinct notion of the primary qualities, and inform us what they are in themselves: But of the secondary qualities, our senses give us only a relative and obscure notion. They inform us only, that they are qualities that affect us in a certain manner, that is, produce in us a certain sensation; but as to what they are in themselves, our senses leave us in the dark. ([emphasis added]; EIP II. 17, 201)

Reid argues that knowledge of primary qualities (like squareness, or hardness, or motion) is direct: it captures everything there is to know about such a quality. Squareness, hardness, motion, and all the other mathematical qualities of bodies are known intrinsically. The conception human beings have of secondary qualities, like color, for instance, is not like this; hence it does not constitute knowledge. All there is to know about a secondary quality is that sentient beings are constituted in such a way that whenever a normal being is in contact with the color red, under normal conditions, that being gets a sensation, which is different in what it feels like to that being from the sensation that same being gets whenever it is stimulated with the color yellow, under normal conditions. Other examples of primary qualities of bodies include shape, size, and solidity. Besides color, other examples of secondary qualities are heat, cold, smell, and taste.

This distinction is important for understanding Reid’s view of original perception, since one way of drawing this distinction is by reference to what kinds of things can be originally perceived, as opposed to what kinds of things can be perceived only in an acquired manner. It might seem that the distinction between original and acquired perception is essentially linked with the more traditional one between primary and secondary qualities of bodies. This is indeed what several scholars have argued, citing as main evidence for this interpretation the fact that human beings have direct conceptions only of primary qualities. Based on this type of conception, human beings gain knowledge only of primary qualities and, if perception is supposed to give perceivers knowledge, as Reid thinks, it seems clear that perceivers can perceive only primary qualities of bodies, since perceivers do not gain any knowledge, by their senses, of secondary qualities. This argument seems correct, but it faces a serious difficulty: Reid specifically and consistently places color, a secondary quality, on the list of things that can be originally perceived (IHM 6. 20, 171; EIP II. 21, 236). So, if we are to listen to Reid, the distinction between primary and secondary qualities, on the one hand, and the distinction between original and acquired perception, on the other, do not carve the world in the same way. The distinction between original and acquired perception, therefore, must be clarified in a different way.

b. Acquired Perception

Acquired perception is distinguished from original perception primarily by the role of learning and experience. There is no need for any type of experience, according to Reid, for human beings to be able to perceive the primary qualities of bodies and the bodies themselves by touching them, for instance. However, one must learn to associate a certain sign, whether an original perception or a mere sensation, with a certain external object. There is a controversy in the literature concerning what exactly this learning involves: according to some authors (for example, Van Cleve (2004)), it initially involves inference or reasoning, thus excluding anything that we acquire in this way from the list of things that we actually perceive, since perception, for Reid, is a faculty that does not rest on the perceiver’s reasoning powers, as indicated in the previous section (EIP II. 6, 101). According to other authors (such as Copenhaver (2010)), however, acquired perception never involves any type of reasoning. Rather, Reid intended acquired perception to be understood as a distinctively perceptual ability: with the passage of time, normal perceivers acquire more perceptual sensitivity to properties not represented in original perception. Here is Reid explaining how this happens in the case of perception of depth and three-dimensional figure by sight:

It is experience that teaches me that the variation of colour is an effect of spherical convexity […]. But so rapid is the progress of the thought, from the effect to the cause, that we attend only to the last, and can hardly be persuaded that we do not immediately see the three dimensions of the sphere. (EIP II.21, 236)

The fact that this type of ability is called “acquired” should not suggest that it is less natural than the original variety. Beings endowed with the ability to develop acquired perception do not develop this ability consciously or only because they decide to acquire certain perceptions. Here is what Reid says concerning this:

In acquired perception, the signs are either sensations, or things which we perceive by means of sensations. The connection between the sign and the thing signified, is established by nature: and we discover this connection by experience; but not without the aid of our original perceptions, or of those which we have already acquired. After this connection is discovered, the sign, in like manner as in original perception, always suggests the thing signified, and creates the belief of it. (IHM 6. 24, 191)

Acquired perception thus builds upon the original abilities of sensing and originally perceiving things in nature that human beings have. In acquired perception, in contrast to original perception, the associations between signs and things signified are introduced by a combination of nature and experience. In original perception, these associations are the result of nature alone: this is the way humans are constituted. Reid believes that acquired perceptions are far more numerous than original perceptions (EIP II. 21, 235).

3. Memory

a. General Considerations on Memory

Memory, for Reid, is the perfect counterpart to perception: it is an original faculty of minds, which is meant to give beings endowed with it immediate access to the past. He argues that it is a first principle of common sense “[t]hat those things did really happen which I distinctly remember” ([emphasis added]; EIP VI. 5, p. 474) and that the knowledge that memory gives is “immediate knowledge of things past” (EIP III. 1, p. 253). No mental entities, such as ideas, mediate a being’s access to the external world in memory, just as no such entities mediate such a being’s access to the world in perception. There are three things involved in perception, and, similarly, there are three things involved in memory: a mind, a faculty, and an external object, which the mind gains knowledge of via the faculty in question. For Reid, “[m]emory implies a conception and belief of past duration” (EIP III. 1, p. 254). This formulation mirrors the one that he gave to further explain how perception operates, although both in the case of memory and of perception, these explanations are not definitions, since both these faculties are simple, and hence cannot be reductively defined by analyzing their components. The external object, in the case of perception, is (allegedly) presently existing; the external object, in the case of memory, was (allegedly) existing in the past of the mind having the memory in question. Beings endowed with perception can be said to misperceive things, which are either different than they appear to be or do not exist at all; and beings endowed with memory can be said to misremember things, which were either different than they appeared to such beings or did not exist at all.

To present his own views on memory, Reid starts by criticizing his precursors, primarily Locke and Hume, for operating with a so-called “store-house” model of memory. Contrary to what he takes Locke and Hume to be saying, memory is not a repository for ideas, which can be revived whenever the person who had those ideas needs them again (for example, Locke, Essay II.x.2). The main problem here, according to Reid, is that if an idea could indeed be revived in this way, that idea would be perceived again, and not actually remembered. This is because, as Reid understands them, Locke and Hume argue that ideas are the immediate objects of perception. So, whenever an idea is present to the mind, whether for the first time or when it is revived, the mind should be said to perceive it. What does memory contribute here, Reid asks? Even though Reid is not the most charitable interpreter of Locke or of Hume, some of the criticisms he raises are cogent. There is a threat of circularity in the account of memory offered by both Locke and Hume, as Reid understands them. Both Locke’s and Hume’s accounts of memory seem to presuppose memory, rather than explain it: the ability to understand that a certain idea that is now present to the mind is exactly the same, qualitatively, and not numerically (since both Locke and Hume believe that ideas are fleeting), as an idea that was present to the mind at a previous moment of time, needs memory. The problem is that no idea contains any information, qualitative or representational, that could be used to identify that idea as being about the past.

So, what is Reid’s positive account of memory? Here is what he says at the beginning of the Essay on memory:

Things remembered must be things formerly perceived or known. I remember the transit of Venus over the sun in the year 1769. I must therefore have perceived it at the time it happened, otherwise I could not now remember it. Our first acquaintance with any object of thought cannot be by remembrance. Memory can only produce continuance or renewal of a former acquaintance with the thing remembered. (EIP III. 1, p. 253-55)

This suggests that Reid is operating with a precursor of a distinction used in the psychological literature of the twentieth century, as advanced primarily by Tulving (1983). According to Tulving (1983) there are two main types of long-term memory: procedural memory, whereby one remembers how to perform certain actions (for instance, one remembers how to ride a bike or how to bake a cake), and declarative memory. The latter type is itself divided into episodic memory, whereby one remembers an experience that one underwent or an event one witnessed (for example, somebody remembers running in her first 5K race), and semantic memory, whereby one remembers that so-and-so is the case, where the fact remembered may be something that happened before one’s time (such as when one remembers that Napoleon was defeated at Waterloo). Semantic memory is further distinguished from the episodic kind by the so-called “previous awareness condition” on episodic memory, which requires that someone have been present, as a witness or agent of an event, for that event to be episodically remembered. Reid thinks that something like the previous awareness condition on episodic memory must be satisfied in cases like the one quoted above: for someone to remember something, that person must have perceived that thing at an earlier moment of time.

There is a debate among Reid scholars concerning this very issue: did Reid think that all memory should be understood as episodic, or did he have room in his theory for semantic memory as well? Some authors believe that, for Reid, all memory is episodic (for instance, Van Woudenberg (1999)); others believe that Reid was concerned with both semantic and episodic memory (such as Copenhaver (2009)). The consensus in the literature is, however, that Reid had nothing very interesting to say about procedural memory. This is important since it shows Reid to be very forward-thinking in his treatment of memory: he believes that episodic memory is fundamental for a being’s immediate knowledge of its past.

So, how does memory connect a being endowed with such a faculty with past events? According to Reid, memory does not offer a being endowed with this faculty a present connection with an event experienced in the past. The access to past events is not by re-acquaintance, as Locke or Hume would say. The past acquaintance with the event itself is preserved through the conception and belief deployed in a memorial experience. This is because, according to Reid, apprehension, when employed by another faculty, such as perception or consciousness, is strictly related to the present moment:

It is by memory that we have an immediate knowledge of things past: The senses give us information of things only as they exist in the present moment; and this information, if it were not preserved by memory, would vanish instantly, and leave us as ignorant as if it had never been. (EIP III. 1, p. 253)

b. Memory and Personal Identity

Reid is famous for his criticism of Locke’s theory of personal identity. The success of this criticism depends on the explanation of the relationship that perception and consciousness, on the one hand, and memory, on the other, have with time. Perception and consciousness give a being endowed with such faculties immediate knowledge of presently existing things: of how the external world is, and of how the mental operations of the minds of such beings succeed one another, respectively. Memory, on the other hand, gives beings endowed with this faculty immediate knowledge of things past; and these things can be, in turn, external or internal. Someone can remember, for instance, having a certain nauseating sensation upon encountering some rotten food. That person will not only remember the state of the food, in this case, but also his having a certain unpleasant sensation.

Reid finds Locke’s theory of personal identity lacking on two counts: (i) first, Locke suggests that consciousness can extend to the past (Essay II.xxvii.9); (ii) second, Reid thinks that Locke is claiming that personal identity consists in memory—sometimes this theory of personal identity is called “the memory theory of personal identity.” The two issues are related, and the first one might very well be terminological: what Locke meant by “consciousness,” in this context, Reid means by “memory”:

Mr Locke attributes to consciousness the conviction we have of our past actions, as if a man may now be conscious of what he did twenty years ago. It is impossible to understand the meaning of this, unless by consciousness is meant memory, the only faculty by which we have an immediate knowledge of our past actions. (EIP III. 6, p. 277)

The second issue is more serious. The problem has to do with the fact that Locke seems to require sameness of memory for sameness of person. The type of memory involved here is episodic memory, and this might be why Locke thinks that consciousness is something that is needed here: in order to remember something about oneself episodically, a person must remember the event “from the inside.” For instance, if someone remembers, episodically, having run a 5K race this past Sunday, that person cannot be mistaken regarding who was the agent of the act of running in the race. That particular person also could not be mistaken about what it felt like to run a 5K this past Sunday. These are all characteristics of episodic memory. Furthermore, if that particular person cannot be mistaken with regard to who was the agent of this act of running (namely that person himself), then that particular person must have existed this past Sunday, at the time of the race. In thinking that memory is necessary for personal identity, Locke doesn’t seem to commit a grave error of reasoning.

Reid, however, argues that this account is absurd, because it leads to absurd consequences. To show that he is right, Reid discusses the now famous case of the brave officer:

Suppose a brave officer to have been flogged when a boy at school, for robbing an orchard, to have taken a standard from the enemy in his first campaign, and to have been made a general in advanced life: Suppose also, which must be admitted to be possible, that when made a general he was conscious of his taking the standard, but had absolutely lost the consciousness of his flogging.

These things being supposed, it follows, from Mr. LOCKE’s doctrine, that he who was flogged at school is the same person who took the standard, and that he who took the standard is the same person who was made a general. Whence it follows, if there be any truth in logic, that the general is the same person with him who was flogged at school. But the general’s consciousness does not reach so far back as his flogging, therefore, according to Mr. LOCKE’s doctrine, he is not the person who was flogged. Therefore the general is, and at the same time is not the same person as him who was flogged at school. (EIP III. 6, p. 276)

This case, which builds upon an objection raised by Joseph Butler (1736), is supposed to show that personal identity, understood as consisting in memory, is not a consistent notion. Here is why: due to the transitivity of numerical identity, the old general should be numerically identical with the boy who was flogged for robbing an orchard. This should follow, on the assumption that the boy who was flogged is numerically the same as the brave officer, who, in turn, is supposed to be numerically the same as the old general. Memory ensures that the boy who was flogged is the same as the brave officer, since the brave officer remembers that incident from his childhood. It ensures, moreover, that the general is the same as the brave officer, since the general remembers (episodically) that event from his youth. But, on Locke’s theory of identity, Reid claims, the general is not the same person as the boy who robbed the orchard, since the general does not remember (episodically) that event from his childhood. There are two possibilities: (i) to explain personal identity without recourse to numerical identity, since transitivity holds for numerical identity, but this example shows that transitivity fails for personal identity as Locke understands it; or (ii) to give up Locke’s theory of personal identity, since any theory that does not respect the rules of logic is irremediably flawed.

Reid chooses (ii) and argues that memory is neither necessary nor sufficient for personal identity. Memory is not a necessary condition for personal identity, since during their lives, human beings witness or are the agents of many events of which they have no recollection at later moments of time. However, it would be absurd to claim that just because someone does not remember something having happened, that person was not actually there. Here is what Reid says on the issue: “I may have other good evidence of things which befell me, and which I do not remember: I know who bare me, and suckled me, but I do not remember these events” (EIP III. 4, p. 264). Neither is memory a sufficient condition for personal identity, according to Reid, since even though someone may be able to remember episodically that he was the agent or the witness of an event, it is not his remembering the event that makes it the case that he himself is the same person he was then. “It may here be observed […] that it is not my remembering any action of mine that makes me to be the person who did it. This remembrance makes me to know assuredly that I did it; but I might have done it, though I did not remember it” (EIP III. 4, p. 265). Memory gives someone immediate knowledge of a past event that person was the witness to or agent of, but it is not what makes it the case that that person was there at the time of the event.

Reid’s theory of personal identity is deflationary: he argues that this notion is primitive. The only way to understand more about this relation is by contrast to other relations: “I can say that diversity is a contrary notion, and that similitude and dissimilitude are another couple of contrary relations, which every man easily distinguishes in his conception from identity and diversity” (EIP III. 4, p. 263). Just like Locke before him, Reid acknowledges that identity, in general (thus including the special case of personal identity), presupposes “an uninterrupted continuance of existence” (EIP III. 4, p. 263). Due to this feature of identity, there is no way to think that mental states and processes remain identical over time:

Hence we may infer, that identity cannot, in its proper sense, be applied to our pains, our pleasures, our thoughts, or any operation of our minds. The pain felt this day is not the same individual pain which I felt yesterday, though they may be similar in kind and degree, and have the same cause. The same may be said of every feeling, and of every operation of the mind: They are all successive in their nature like time itself, no two moments of which can be the same moment. (EIP III. 4, p. 263)

Thus, Reid thinks that persons should not be identified with their thoughts or feelings, but with the subject of such thoughts and feelings, which remains the same over time. This subject is an immaterial substance, a soul, which is best understood by reference to Leibniz’s notion of a monad (EIP III. 4, p. 264).

4. Intellectual Powers (Proper)

a. Conception

The fourth Essay is dedicated to conception, whose primary role is to be an ingredient (or concomitant) in all other operations of the mind. In this picture, conception is being used as part of the endeavor to gain knowledge of the external world (when it is employed by the senses), of the internal world (when it is employed by consciousness), and also to analyze the complex relationships that exist among the objects of the world, among numbers in mathematics, and among rules of reasoning in logic. As such, conception is a faculty that acts as a bridge, connecting the information gathered by the senses with the intellectual processing powers of judgment and reasoning.

Since conception is a simple operation of the mind, it cannot be subjected to a reductive definition any more than the other operations can be. However, as always, Reid argues that it has certain features which are useful to know in order to better understand how it functions, both when it is an ingredient or concomitant of other operations, and when it is employed on its own, as “bare conception.”

Reid argues that conception is an ingredient in all of the other operations of the human mind:

Our senses cannot give us the belief of any object, without giving some conception of it at the same time: No man can either remember or reason about things of which he hath no conception: When we will to exert any of our active powers, there must be some conception what we will to do: There can be no desire or aversion, love nor hatred, without some conception of the object: We cannot feel pain without conceiving it, though we can conceive it without feeling it. These things are self-evident. (EIP IV. 1, p. 296)

As already pointed out, the argument that sensations must be intentional, and hence take themselves as objects, is based on this idea that every operation of the mind has conception as an ingredient. The passage quoted above can indeed be read as saying that one must conceive of the pain one is feeling at a given moment of time in order to actually be able to feel it. However, it is controversial in Reid scholarship what exactly “conception” is supposed to mean in this context, despite its name. The issue concerns the fact that Reid believes that human beings share most of their perceptual and sensory abilities with lower-level animals and with human infants, who do not have a well-developed conceptual framework; thus, some authors argue that “conception” should not be taken to mean that unless one is able to have and deploy fully formed concepts, one will not be able to feel pain, for instance. In this interpretation, conception should be understood as the operation that allows beings endowed with this faculty to get acquainted with an object, be that object something that exists in the present, existed in the past, or will never exist.

On the other side of this controversy are those authors who point out that it is rather counter-intuitive to believe that conception does not operate via concepts—after all, the name might be indicative of something here. The role of conception, as an ingredient in all the other operations of the human mind, is to allow humans to secure a mental grip on something. Such a mental grip is secured by deploying a singular concept, understood to be something like a uniquely identifying definite description. In this interpretation, a being would not be able to have a sensation, a perception, or a memory unless it was able to deploy a singular concept, a uniquely identifying definite description isolating that thing in the world.

i. Bare Conception

Reid calls conception, as employed on its own, and not as an ingredient in any of the other operations of the human mind, “bare conception.” This suggests that when employed on its own, conception has a different role than when employed by a faculty of the mind in which it enters as an ingredient: “yet it may be found naked, detached from all others, and then it is called simple apprehension, or the bare conception of a thing” (EIP IV. 1, p. 286).

One of the most interesting features of bare conception is its ability to be used to think about objects without any heed being paid to their existence or non-existence, and also about propositions, without any judgment of their truth or falsity.

In bare conception there can neither be truth nor falsehood, because it neither affirms nor denies. Every judgment, and every proposition by which judgment is expressed, must be true or false; and the qualities of true and false, in their proper sense, can belong to nothing but to judgments, or to propositions which express judgment. In the bare conception of a thing there is no judgment, opinion, or belief included, and therefore it cannot be either true or false. (EIP IV. 1, p. 296)

Conception, in this sense, is that faculty allowing human beings to grasp the meaning of a proposition, which is the prerequisite for being able to judge a certain proposition as true or false: “it is one thing to conceive the meaning of a proposition; it is another to judge it to be true or false” (EIP I. 1, p. 25). Things are being conceived by beings endowed with this faculty in the following manner: an object is brought before the mind, with the help of conception: “I conceive an Egyptian pyramid. […] the thing conceived may be no proposition, but a simple term only, as a pyramid, an obelisk” (EIP I. 1, p. 25). Bare conception seems to require the mind of the conceiver to use certain concepts—simple terms—to bring forth objects to the mind in a way in which conception, when employed as an ingredient in other operations of the human mind, does not. This should not be surprising, though: once someone is able to think about something, even when he is not perceiving or remembering it, his mind will have established a certain grasp of that thing, classified and analyzed it, such that he will be able to think about it without using any of his other faculties. How this comes about will be better understood once Reid’s accounts of abstraction, judgment, and reasoning are presented, but it is already worth noting that it is not conception that supplies the mind with the most simple and exact notions the mind has of external things; these are acquired by using the mind’s superior reasoning powers (EIP IV. 1, p. 309).

Ideas as acts of the mind. Bare conception can be understood by analogy with painting, Reid argues, but he warns us that analogous thinking can take us only so far. Conception should be distinguished from painting, since “[t]he action of painting is one thing, the picture produced is another thing. The first is the cause, the second is the effect” (EIP IV. 1, p. 300). Reid’s worry is that conception will be thought to work in the same way, to produce images of things in the mind, or ideas. Reid denies that this is the case, and puts forward a theory of ideas as acts of minds rather than objects of such mental operations: “Let this therefore be always remembered, that what is commonly called the image of a thing in the mind, is no more than the act or operation of the mind in conceiving it” (EIP IV. 1, p. 300). To unpack this further, let us think about the elements involved in conceiving that the sun is yellow, for instance. Reid argues that in this act of conception, there are the following three elements: a mind, an act of conception that the sun is yellow, and the thing itself (the sun), external to the mind in question. Furthermore, he argues that there is something missing: an image in the mind, an additional representation, that has the explicit content of a yellow sun. He is willing to assert that this is just a verbal dispute, if everyone else is willing to agree with him that these images in the mind, or ideas, are nothing more than acts of conceiving. This is a moot point, given that everyone else was dead at the point when he was writing, and no one could have agreed with him. But, in effect, this is a serious conceptual point.

The analogy with painting should help classify conceptions into three classes, according to Reid. Just like a painter paints by using his imagination, by copying from other paintings, or by painting live subjects, there are conceptions which can be called “creatures of fancy”—like Don Quixote or Pegasus; conceptions of universals—which are analogous to paintings which copy other paintings; and conceptions of individual (existing) things—which are like paintings of live subjects.

Our conceptions, therefore, appear to be of three kinds: They are either the conceptions of individual things, the creatures of God; or they are conceptions of the meaning of general words; or they are the creatures of our own imagination. (EIP IV. 1, p. 305)

There are two issues worthy of attention in this classification: (i) Reid argues that people can name the creatures of fancy they invent, “conceive them distinctly, and reason consequentially concerning them, though they never had an existence” (EIP IV. 1, p. 301-2). And (ii) conceiving universals—like kinds and species of things—means nothing more nor less than to conceive the “meaning which other men who understand the language affix to the same words” (EIP IV. 1, p. 302). The first of these issues shows Reid to think that it is possible for fictional names to be used in the same way as regular names, even though the former category will be used to name nonexistents.

Reid’s Meinongianism. Based on Reid’s idea that people can think and “reason consequentially” about fictional characters and objects, Nichols (2002) argued that Reid is a precursor of Meinong. Reid’s rejection of the way of ideas and his dedication to common sense philosophy are thought to amount to a rejection of the position according to which conceiving the nonexistent means nothing more than conceiving images or any other types of mental intermediaries. Centaurs, not centaur-inspired images or ideas, are the objects of such centaur thoughts. The only exception is constituted by a thought which is explicitly about a painting of a centaur, in which case it should be obvious to everyone that what is being conceived is an image, and not a mythological animal.

This one object which I conceive [a centaur], is not the image of an animal, it is an animal. I know what it is to conceive an image of an animal, and what it is to conceive an animal; and I can distinguish the one of these from the other without any danger of mistake. (EIP IV. 2, p. 321)

Reid does not talk about different levels of existence; there is no doubt that centaurs do not exist as flesh-and-blood animals. It is important, however, to note that Reid ascribes intentionality to all the operations of the human mind, and this intentionality is to be resolved by understanding how conception works.

ii. Imagination

At the beginning of the EIP, when Reid is defining the terms he is going to use throughout the book, and at the beginning of the fourth Essay, where he lays down his views on conception, he claims that “conception” and “imagination” are synonymous words, and, moreover, that no reductive definition of these notions can be given, since they are supposed to denote simple operations of the mind. However, in the course of his analysis of conception, it becomes clear that imagination is not exactly the same thing as conception.

Reid argues that “imagination,” when used with its proper meaning, denotes a type of conception that is concerned primarily with the objects of sight (EIP IV. 3, p. 326). This restriction to sight probably has more to do with etymology than with the proper meaning of “imagination.” Imagination is supposed to apply to other senses, although Reid thinks that such uses are not altogether proper (EIP V. 6, p. 394). Any conception is of the imaginative kind when it is lively and about possible objects of sense. One consequence is that people can never be said to imagine universals, or propositions; neither are people supposed to think that anyone is imagining objects of sense, when they are actually perceiving them. A different kind of conception is responsible for the proper workings of perception.

Reid’s distinction between conception proper and imagination is one of the first instances in philosophy of mind in which imagination is presented as a faculty of the human mind related most closely to perception. Reid’s main breakthrough is his arguing that conception proper is used for understanding and acquiring general and abstract concepts, while imagination is used to think about things that might have existed, and, as such, might have presented beings endowed with such a faculty or system with perceptual stimuli.

b. Judgment and Reasoning

Reid dedicates two essays to the mental powers of judgment and reasoning with which he believes human beings to be endowed by nature. Essay VI, the one dedicated to judgment, presents the main elements of what Reid takes to be the philosophy of common sense. After a general introduction, in which he describes the fundamental characteristics of judgment, Reid argues that certain principles should be taken for granted as true. These are the first principles of common sense, which describe how the external and internal worlds work. These principles are self-evident and as such their truth cannot be demonstrated through any kind of reasoning. In the following essay, dedicated to reasoning, Reid argues that it is the purview of this faculty to produce judgments, or to combine and analyze them, in two main ways: deductively or probably. In what follows, these issues are discussed in turn, by first explaining what Reid thought about judgment, and then providing a schematic account of how deductive reasoning is supposed to be applied to the class of necessary truths, while probable reasoning is supposed to be applied to the class of contingent truths.

i. The Fundamental Characteristics of Judgment

Reid talks about judging in terms of offering mental assent or dissent to the issues represented by any particular judgment. Reid thinks that if human beings were not endowed with such an operation, they would not be able to reason abstractly. Without analyzing, abstracting, and judging when they reached correct conclusions, human beings would have been given reasoning in vain:

[S]ome exercise of judgment is necessary in the formation of all abstract and general conceptions, whether more simple or more complex; in dividing, in defining, and in general, in forming all clear and distinct conceptions of things, which are the only fit materials of reasoning. (EIP VI. 1, p. 413)

Some authors argue that judging should not be understood as involving just mental affirming or denial of its content, since that would not distinguish judging from believing. Although Reid’s official characterization of judgment is meant to clarify how this mental operation accompanies all others, belief already implies a mental assent/dissent given to its content. In the picture Reid is putting forward, there seems to be no way to explain why somebody would assent (dissent) to something without that person’s already having a belief that it is true (or false). Judgment, therefore, seems to presuppose belief. Judgment, then, would simply be superfluous, while belief would be ubiquitous, either as a concomitant or an ingredient in all other operations of the human mind (Rysiew (2004): 65). This, however, contradicts Reid’s official characterization of judgment:

[A] man who feels pain, judges and believes that he is really pained. The man who perceives an object, believes that it exists, and is what he distinctly perceives it to be; nor is it in his power to avoid such judgment. And the like may be said of memory, and of consciousness. Whether judgment ought to be called a necessary concomitant of these operations, or rather a part or ingredient of them, I do not dispute. But it is certain, that all of them are accompanied with a determination that something is true or false, and a consequent belief. If this determination be not judgment, it is an operation that has got no name; for it is not simple apprehension, neither is it reasoning; it is a mental affirmation or negation; it may be expressed by a proposition affirmative or negative, and it is accompanied with the firmest belief. (EIP VI. 1, p. 409)

To save Reid from this inconsistency, some have argued that the distinctive character of judgment emerges not from his official characterization of this mental operation, but rather from his comparing it to an external, real-life tribunal. This analogy is not perfect, and per Reid’s instructions (EIP I. 4, p. 55), people should not be lulled into a sense of confidence that they really know what they are talking about when they invoke analogous thinking, especially with regard to analogies concerning the body—or all things external—and mind—or all things internal. However, people are entitled to use the same name—“judgment”—to refer to both the process that results in an assenting/dissenting opinion in a court of law, and to the one that results in an assenting/dissenting belief in the internal tribunal, in virtue of the process involving reasoned reflection and deliberation. The fundamental characteristic of judgment in Reid’s system is its deliberative/reflective character, and not its relation to assent or dissent, which is, in turn, reserved for belief (Rysiew (2004): 67).

ii. Common Sense

Reid argues that sense and judgment are intrinsically related, such that sense always implies judgment: “A man of sense is a man of judgment” (EIP VI. 2, p. 424). He believes this to hold true both for what he calls “the external senses” (for instance, touch, taste, sight) and for the so-called “internal senses” (for instance, moral sense and internal taste). Since Reid believes (mistakenly, as was discussed above) that judgment is the operation of the mind that helps people determine, “concerning any thing that might be expressed by a proposition, whether it be true or false” (EIP VI. 3, p. 435), and since he talks about common sense in the Essay dedicated to illuminating the nature of judgment, it should be obvious that he thinks that common sense is a specialized kind of judgment, understood as a faculty of the human mind. To wit, Reid thinks that common sense is that minimal degree of understanding that every adult human being possesses (or should possess), such that he can function well in this world. Common sense is concerned only with propositions that express self-evident truths (or falsehoods); judgment, more generally, is concerned with propositions that express any other kinds of truths or falsehoods.

Reid believes that self-evident principles are at the foundation of any kind of knowledge and that common sense is the mental operation that discovers such principles for human beings:

All knowledge, and all science, must be built upon principles that are self-evident; and of such principles, every man who has common sense is a competent judge, when he conceives them distinctly. Hence it is, that disputes often terminate in an appeal to common sense. (EIP VI. 2, p. 426)

This suggests that Reid thinks that human beings are all endowed with a mental operation, common sense, that is meant to discover the first principles upon which any kind of science is built. These first principles, when considered distinctly, namely in isolation from anything else, will be immediately found to be true, just as anything merely parading as a first principle, when considered distinctly, will be found to be false. No one undergoes a complicated reasoning procedure to discover the truth (or falsehood) of such principles; everyone just knows this, because, in being self-evident, these principles wear their truths conspicuously. In other words, what results from exercising the faculty of common sense is intuitive knowledge. Reid explains that reason and common sense do not conflict, because common sense is part of reason, just as judging does not oppose reason:

We ascribe to reason two offices, or two degrees. The first is to judge of things self-evident; the second to draw conclusions that are not self-evident from those that are. The first of these is the province, and the sole province of common sense; and therefore it coincides with reason in its whole extent, and is only another name for one branch or one degree of reasoning. (EIP VI. 2, p. 433)

Deduction from true principles can never contradict common sense, since “truth will always be consistent with itself” (EIP VI. 2, p. 433).

iii. First Principles of Common Sense

Reid thus believes that human beings are endowed with a faculty that gives them immediate knowledge of self-evident principles. He calls this faculty “common sense,” but it is more common to refer to the results of employing this faculty by the name of “intuitive knowledge.” The main idea here is that such knowledge of first principles is widespread: for instance, people are said to intuit axioms in mathematics and in logic; they also are thought to intuit first principles in morals, just as they intuit first principles regarding the expression of beauty in the arts, Reid believes. This knowledge is not innate; after all, as an Empiricist, Reid thinks that all knowledge is acquired. The faculty of common sense, just like all the other original faculties, is innate, in the sense that it is part of the mental architecture of a human being. The sense in which this intuitive knowledge is immediate without being innate is the following: once reasoning and the ability to process a human language are sufficiently developed, a human being will be able to know, non-inferentially, that certain propositions, when considered distinctly, are true.

Reid calls such propositions first principles, and he argues that they can be divided into two classes: first principles of contingent truths, on the one hand, and first principles of necessary truths, on the other. As Van Cleve (1999) points out, just because the former type of principles have contingent truths as their contents, this does not mean that the principles themselves are, in any way, less necessary than those of necessary truths. It is the truths themselves that are either necessary or contingent:

The truths that fall within the compass of human knowledge, whether they be self-evident, or deduced from those that are self-evident, may be reduced to two classes. They are either necessary and immutable truths, whose contrary is impossible, or they are contingent and mutable, depending upon some effect of will and power, which had a beginning, and may have an end. (EIP VI. 5, p. 468)

Since this article is concerned with the main tenets of Reid’s philosophy of mind, first principles are interesting for this purpose only inasmuch as they are discovered by a faculty (common sense) with which every human being is supposed to be endowed, and they will not be discussed in more detail.

iv. Reasoning

If the first principles of common sense are discovered by employing the operation of intuitive judging, reasoning proper is to be employed to discover whatever conclusions follow from self-evident principles. Since there are two classes of first principles, Reid argues that there are two types of reasoning. Demonstrative reasoning is employed to draw conclusions that follow from the first principles of necessary truths, whereas probable reasoning is employed to draw conclusions that follow from the first principles of contingent truths (EIP VII. 3, p. 556).

The strength of demonstrative reasoning, which is commonly employed in mathematics and logic, is such that for showing that a conclusion follows from some axioms (or first principles) nothing else needs to be done other than offering one demonstration. Reid thinks that it would be superfluous to try to give several different demonstrations to prove one conclusion, while employing demonstrative reasoning, even though a variety of proofs may be available in practice:

To add more demonstrations of the same conclusion, would be a kind of tautology in reasoning; because one demonstration, clearly comprehended, gives all the evidence we are capable of receiving. (EIP VII. 3, p. 556)

It is not so with probable reasoning:

The strength of probable reasoning …depends not upon any one argument, but upon many, which unite their force, and lead to the same conclusion. Any one of them by itself would be insufficient to convince; but the whole taken together may have a force that is irresistible, so that to desire more evidence would be absurd. (EIP VII. 3, p. 556)

Probable reasoning is the method of choice for all the natural sciences, whose true propositions are contingent. According to Reid, probable reasoning comes in degrees, whereas demonstrative reasoning does not admit degrees; it is absolute.

In every step of demonstrative reasoning, the inference is necessary, and we perceive it to be impossible that the conclusion should not follow from the premises. In probable reasoning, the connection between the premises and the conclusion is not necessary, nor do we perceive it to be impossible that the first should be true while the last is false. (EIP VII. 1, p. 544-45)

Although Reid argues that probable reasoning is of a different kind than demonstrative reasoning (EIP VII. 3, p. 557), according to Lehrer (1989: 174), probable reasoning can lead to conclusions that are certain. Reid thinks that the vulgar are mistaken when contrasting probable reasoning with certainty. Probable reasoning, according to Reid, has degrees of evidence, “from the very least to the greatest which we call certainty” (EIP VII. 3, p. 557).

Hume, in the Treatise, argues that all knowledge should be reduced to probability, because human beings are fallible creatures, endowed with fallible faculties. Reid’s understanding of probable reasoning as a type of reasoning that can lead to conclusions that are certain constitutes a direct refutation of Hume’s argument. The problem, Reid points out, is that requiring a proof of the reliability of the human faculties would be circular, because it could be given only by using those reasoning powers themselves, “and is therefore that kind of sophism which Logicians call petitio principii” (or “begging the question”) (EIP VII. 4, p. 571). Hume writes that “[n]ature, by an absolute and uncontrollable necessity, has determined us to judge, as to breathe and feel” (Hume, Treatise I.iv.1, p. 183). Reid agrees with Hume in part: probable reasoning concerning cause and effect, for instance, is the result of an innate principle of human constitution. Such a principle is known to be true, by intuition, and by exercising the faculty of common sense. But Reid also disagrees with Hume, and points out that probable reasoning concerning cause and effect is not merely a matter of custom. The relevant first principle of contingent truth allows human beings to be certain that an effect follows its cause, not because they reason that it is so, but because they judge (intuitively) that it is so.

5. Taste

Reid considers the principles of the so-called “internal taste” in Essay VIII, the last of the EIP. Contemporary philosophy of mind is mostly silent concerning the way human beings interact with and appreciate works of art; the widespread belief seems to be that such issues belong to value theory rather than to the philosophy of mind proper. Reid, however, is part of a different tradition, which sought to explain the interest humans have in art and its artifacts, and consequently the interactions humans seek with said artifacts, starting by observing human psychology. As such, he, just like some of his predecessors (for example, Hume, Hutcheson, and Shaftesbury), thinks that adult human beings are endowed with a special faculty, taste, which is supposed to help them appreciate beautiful or aesthetically relevant things, and disapprove of those that are found to be lacking the sought-after qualities. Reid is thus mostly describing and analyzing the aesthetic experience, rather than addressing issues that are relevant from the point of view of the philosophy of art. In the course of doing this, however, he is interested in questions pertaining to art and artworks. Reid has an expression theory of art, in that he is interested in how art can express emotion, or, better still, how artists can and do express emotions through an artistic medium. If art is a sort of language, the faculty of taste, as applied to the aesthetic qualities of artworks, is the way to be made privy to this language: by employing this faculty, human beings become sensitive to the signs and decode their meaning. However, this is not the only way people employ their internal sense: by using this faculty they also become sensitive to the aesthetic qualities of the world. Reid’s idea is that just as a painter expresses an emotion in his works, God expresses certain emotions in his works. One cannot gain complete knowledge of the external world, in this picture, unless one understands and appreciates the beauty of the world.

a. Why This Faculty Is Called “Internal Taste”

This name indicates that the faculty itself is of the same kind as the other type of taste, but in what sense is it “internal”? To better understand this, consider the distinction that Reid draws between things internal and things external to the mind at the beginning of the EIP:

When…we speak of things in the mind, we understand by this, things of which the mind is the subject. Excepting the mind itself, and things in the mind, all other things are said to be external. (EIP I. 1, p. 22)

This distinction is as elucidating as it is confusing: since both types of taste are operations of the mind, they both are, in a sense, internal. However, Reid’s idea is that the “external taste” is supposed to help those beings that have it register information about certain pleasing and displeasing qualities of food and drink. The objects that can be food and drink are external to the mind—they are physical things to be found in the world. So, by analogy, it should probably be thought that “the internal taste” is supposed to help those beings that are endowed with it register information about certain pleasing and displeasing qualities of internal objects—namely, minds and their qualities.

Reid does not argue that other minds can be directly perceived, but he takes it to be a first principle of common sense that other minds exist (the 8th first principle of contingent truths, EIP VI. 5, p. 482-483), and that people learn of their existence by correctly deciphering certain signs. This interpretation of natural signs is innate, since, Reid claims, even small children respond in the correct (that is, expected) way in the presence of an angry parent, for instance. In this picture, the internal sense of taste is meant to discern the quality of excellence that other minds possess, in addition to enhancing the knowledge people have of “the existence of life and intelligence in our fellow-men.” To do so, however, the internal taste orients itself to material objects (since it cannot directly interact with other minds), and identifies that which is beautiful, in nature and in the fine arts (EIP VIII. 1, p. 573).

b. An Objectivist Account of Beauty

Putting everything together, here is the picture that emerges: Reid believes that beauty is a property both of objects and of minds. Moreover, he thinks that beauty itself is both a primary and a secondary quality of objects. Reid’s claim that beauty is a real property of objects directly opposes the idea that beauty is just a feeling in an agent’s mind, advanced by Hume and Hutcheson. As in morals, in the domain of aesthetic value, Reid is an objectivist (at least, according to Benbaji (1999)). The aesthetic (or internal) taste has the dual role of discovering what material objects are beautiful, and, indirectly, what minds, which created those beautiful objects, are inherently beautiful. Beauty, in this picture, is not a feeling in one’s mind, but something external to one’s mind. The internal taste is used to reach aesthetic judgments by evaluating material objects, which express the mental attributes of the artist. Without excellence in the mind, no product of that mind can be perceived as beautiful. Beauty is thus a property of the artist’s mind, and is displayed by the artifacts he creates only in a derivative sense. The internal taste functions very much like perception of external objects: certain signs of aesthetic qualities function to trigger a conception and belief in the existence of the aesthetic quality in question. The internal taste is thus assimilated to the external sense of taste, since both senses are supposed to contribute to the perception of specific qualities of objects.

6. References and Further Reading

a. Primary Sources

Hume, D. (2007). A Treatise of Human Nature. Oxford: Clarendon Press. (Original work published in 1739-40.)
- The standard edition of Hume’s Treatise.
Hume, D. (1874-75). “Of the Standard of Taste,” in vol. 3 of The Philosophical Works of David Hume. Edited by T. H. Green and T. H. Grose. 4 volumes, London: Longman, Green.
- Hume considers whether there can be any objective standard of taste.
Hutcheson, F. (2004). An Inquiry into the Original of Our Ideas of Beauty and Virtue. Edited by W. Leidhold. Indianapolis: Liberty Fund. (Original work published in 1726.)
- This presents Hutcheson’s sentimentalist understanding of beauty.
Locke, J. (1979). An Essay Concerning Human Understanding. Oxford: Clarendon Press. (Original work published in 1700.)
- This is the standard edition of Locke’s Essay. Cited in text as Essay, book, chapter, section number.
Reid, T. (1997) An Inquiry into the Human Mind on the Principles of Common Sense. Edited by Derek R. Brookes. Edinburgh, UK: Edinburgh University Press. (Original work published in 1764.)
- This is the standard edition of Reid’s Inquiry. Cited in text as IHM, chapter, section, page number.
Reid, T. (2002) Essays on the Intellectual Powers of Man—A Critical Edition. Edited by Derek R. Brookes. Edinburgh, UK: Edinburgh University Press. (Original work published in 1785.)
- This is the standard edition of Reid’s work on the intellectual powers. Cited in text as EIP, essay, chapter, page number.
Reid, T. (2010) Essays on the Active Powers of Man—A Critical Edition. Edited by Knud Haakonssen and James A. Harris. Edinburgh, UK: Edinburgh University Press. (Original work published in 1788.)
- This is the standard edition of Reid’s published work on action theory.

b. Secondary Sources

Alston, W. P. (1989). “Reid on Perception and Conception.” In M. Dalgarno, & E. Matthews (Eds.) The Philosophy of Thomas Reid, (pp. 35–47). Dordrecht: Kluwer.
- Argues that conception, despite its name, does not involve the use of any concepts.
Benbaji, H. (1999). “Reid’s View of Aesthetic and Secondary Qualities.” Reid Studies 2, 31-46.
Buras, T. (2005). “The Nature of Sensations in Reid.” History of Philosophy Quarterly, 22(3), 221–238.
- Interprets Reid as saying that sensations are reflexive acts of the mind, taking themselves as objects.
Buras, T. (2008). “Three Grades of Immediate Perception: Thomas Reid’s Distinctions.” Philosophy and Phenomenological Research, 76(3), 603–632.
- Explains that there are three senses of “immediacy,” in Reid, making clear the connection between immediacy and original perception, and acquired perception.
Buras, T. (2009). “The Function of Sensations in Reid.” Journal of the History of Philosophy, 47(3), 329–353.
- Explains what function sensations perform: primarily, they give sentient beings information about how they react to the environment.
Copenhaver, R. (2000). “Thomas Reid’s Direct Realism.” Reid Studies, 4(1), 17–34.
- Explains Reid’s account of perception, classifying it as direct realism.
Copenhaver, R. (2004). “A Realism for Reid: Mediated but Direct.” British Journal for the History of Philosophy, 12(1), 61–74.
- Explains the intermediary role of sensations in the chain of perception.
Copenhaver, R. (2010). “Thomas Reid on Acquired Perception.” Pacific Philosophical Quarterly, 91(3), 285–312.
- Offers a compelling argument to show that acquired perception is indeed a form of perception, and not reasoning.
Copenhaver, R. (2006a). “Thomas Reid’s Philosophy of Mind: Consciousness and Intentionality.” Philosophy Compass, 1(3), 279–289.
- Offers a comprehensive explanation of Reid’s philosophy of mind, centered on the concept of intentionality.
Copenhaver, R. (2006b). “Thomas Reid’s Theory of Memory.” History of Philosophy Quarterly, 23(2), 171–187.
- Discusses the ways in which memory gives people direct knowledge of the past, according to Reid.
Copenhaver, R. (2009). “Reid on Memory and Personal Identity.” Stanford Encyclopedia of Philosophy. http://plato.stanford.edu/entries/reid-memory-identity/
- Offers a comprehensive account of Reid’s theory of memory.
Falkenstein, L. (2004). “Nativism and the Nature of Thought in Reid’s Account of Our Knowledge of the External World.” In T. Cuneo, & R. Van Woudenberg (Eds.), The Cambridge Companion to Reid, (pp. 156–179). Cambridge: Cambridge University Press.
- Explains Reid’s brand of nativism, which allows him to keep fixed certain principles which are dear to the British Empiricists.
Falkenstein, L. and Giovanni Grandi (2003). “The Role of Material Impressions in Reid’s Theory of Vision: A Critique of Gideon Yaffe’s ‘Reid on the Perception of the Visible Figure.’” Journal of Scottish Philosophy, 1(2), 117-133.
- Argue that no sensations are involved in the perception of visible figure.
Folescu, M. (2015). “Perceiving Bodies Immediately: Thomas Reid’s Insight.” History of Philosophy Quarterly, 32(1), 19–36.
- Argues that bodies are objects of original perception, despite perceivers’ gaining only relative (that is, not direct) notions of them by the use of their senses.
Folescu, M. (2015). “Perceptual and Imaginative Conception.” In Todd Buras and Rebecca Copenhaver (eds.), Mind, Knowledge and Action: Essays in Honor of Reid’s Tercentenary, (pp. 52–74). Oxford: Oxford University Press.
- Argues that Reid should have been sensitive to the fact that conception is not employed in the same manner by the perceptual and by the imaginative systems, respectively.
Folescu, M. “Thinking About Different Nonexistents Of The Same Kind.” Published online first in Philosophy and Phenomenological Research. DOI: 10.1111/phpr.12196
- Argues that Reid’s account provides the tools for entertaining singular imaginings of different fantastical creatures of the same kind.
Gallie, R. (1997). “Reid: Conception, Representation and Innate Ideas.” Hume Studies, 23(2), 315-35.
- Argues that conception requires linguistic representation.
Ganson, T. (2008). “Reid’s Rejection of Intentionalism.” Oxford Studies in Early Modern Philosophy, 4, 245–263.
- Argues that sensation is not intentional: it is not about any objects, be those objects the sensations themselves.
Kivy, P. (2004). “Reid’s Philosophy of Art.” In T. Cuneo, & R. Van Woudenberg (Eds.) The Cambridge Companion to Reid, (pp. 267–312). Cambridge: Cambridge University Press.
- Argues that Reid is one of the first philosophers interested in philosophy of art, rather than aesthetics, in general.
Kivy, P. (1978). “Thomas Reid and the Expression Theory of Art.” The Monist, 61(2), 167–183.
- Argues that Reid has, primarily, an expression theory of the arts: artworks express the emotions of their creators.
Kroeker, E. R. (2010). “Reid on Natural Signs, Taste and Moral Perception.” In S. Roeser (Ed.), Reid on Ethics: Philosophers in Depth, (pp. 46–66). Palgrave Macmillan.
- Argues that original beauty and other aesthetic qualities are intrinsic qualities of minds.
Lehrer, K. (1978). “Reid on Primary and Secondary Qualities.” The Monist, 61(2), 184–191.
- Presents and defends the distinction between these two types of properties of objects.
Lehrer, K. (1989). Thomas Reid. London and New York: Routledge.
- Offers a comprehensive exposition of Reid’s philosophy.
Manns, J. W. (1988). “Beauty and Objectivity in Thomas Reid.” British Journal of Aesthetics, 28, 119–131.
- Argues that beauty is objective, for Reid, on the principles of common sense, but not objective, on the correct philosophical principles.
Nauckhoff, J. C. (1994). “Objectivity and Expression in Thomas Reid’s Aesthetics.” Journal of Aesthetics and Art Criticism, 52, 183–191.
- Argues that minds are excellent, hence beautiful, and that any other object deemed beautiful has that quality in virtue of being a sign of some excellence.
Nichols, R. (2002). “Reid on Fictional Objects and The Way of Ideas.” The Philosophical Quarterly, 52(209), 582–601.
- Argues that Reid’s rejection of the “way of ideas” leads him to adopt a form of moderate Meinongianism, before Meinong.
Nichols, R. (2007). Thomas Reid’s Theory of Perception. Oxford: Oxford University Press.
- Analyzes the major tenets of Reid’s theory of perception.
Pappas, G. S. (1989). “Sensation and Perception in Reid.” Noûs, 23(2), 155–167.
- Defends the distinction between sensation and perception in Reid; a classic piece in Reid studies.
Rysiew, P. (1999). “Reid’s [Mis]characterization of Judgment.” Reid Studies 3(1), 63–68.
- Argues that, despite his official characterization, “judgment,” for Reid, should be understood to mean reflection.
Tulving, E. (1983). Elements of Episodic Memory. Oxford: Oxford University Press.
- Explains what types of memory there are, and why episodic memory is fundamental.
Van Cleve, J. (1999). “Reid on the First Principles of Contingent Truths.” Reid Studies 3, 3–30.
- Argues that the first principles of contingent truths allow Reid to be a reliabilist with regard to the cognitive faculties of human beings, without any kind of circularity.
Van Cleve, J. (2004). “Reid’s Theory of Perception.” In T. Cuneo, & R. Van Woudenberg (Eds.) The Cambridge Companion to Reid, (pp. 101–133). Cambridge: Cambridge University Press.
- A comprehensive account of Reid’s theory of perception, with special care given to identifying Reid’s type of realism: direct or indirect. This is the best starting point for anyone interested in getting a better understanding of Reid’s theory of perception.
Van Woudenberg, R. (1999). “Thomas Reid on Memory.” Journal of the History of Philosophy, 37(1), 117–133.
- Discusses the elements of Reid’s theory of memory.
Van Woudenberg, R. (2004). “Reid on Memory and the Identity of Persons.” In T. Cuneo, & R. Van Woudenberg (Eds.) The Cambridge Companion to Thomas Reid, (pp. 204–221). Cambridge: Cambridge University Press.
- Discusses the role of memory in personal identity.
Wolterstorff, N. (2001). Thomas Reid and the Story of Epistemology. Cambridge: Cambridge University Press.
- Explains Reid’s terminology and way of thinking such that contemporary epistemologists can see Reid as an exponent and precursor of some of the issues discussed today.
Yaffe, G. (2003a). “The Office of an Introspectible Sensation: A Reply to Falkenstein and Grandi.” Journal of Scottish Philosophy, 1(2), 135–140.
- Responds to the criticisms raised by Falkenstein and Grandi to the idea that all kinds of perceptions, including the perception of visible figure, involve sensations.
Yaffe, G. (2003b). “Reid on the Perception of Visible Figure.” Journal of Scottish Philosophy, 1(2), 103–115.
- Argues that perceiving the visible figure of objects, for Reid, involves having sensations of color.

Author Information

Marina Folescu
Email: folescum@missouri.edu
University of Missouri
U. S. A.

Demonstratives and Indexicals

In the philosophy of language, an indexical is any expression whose content varies from one context of use to another. The standard list of indexicals includes pronouns such as “I”, “you”, “he”, “she”, “it”, “this”, “that”, plus adverbs such as “now”, “then”, “today”, “yesterday”, “here”, and “actually”. Other candidates include the tenses of verbs, adjectives such as “local”, and a range of expressions such as “yea” or “so” as used in constructions such as “yea big” (said, for example, while holding one’s hands two feet apart). Certain indexicals, often called “pure indexicals”, have their content fixed automatically in a context of use in virtue of their meaning. “I”, “today”, and “actually” are common examples of pure indexicals. Other indexicals, often called “true demonstratives,” require some kind of additional supplementation in a context in order to successfully refer in the context. The demonstrative pronouns “this” and “that” are clear examples of true demonstratives, because they require something of the speaker—some kind of gesture, or some kind of special intention—in order to resolve what the speaker is referring to. Which expressions are pure indexicals and which are true demonstratives is itself a matter of controversy. (The terms “pure indexical” and “true demonstrative” are due, as with so much else on this topic, to David Kaplan.)

Contemporary philosophical and linguistic interest in indexicals and demonstratives arises from at least four sources. (i) Indexical singular terms such as “I” and true demonstratives such as “that” are perhaps the most plausible candidates in natural language for the philosophically controversial theory of direct reference (see section 3e). (ii) Indexicals and demonstratives provide important test cases for our understanding of the relationship between linguistic meaning (semantics) and language use (pragmatics). (iii) Indexicals and demonstratives raise interesting technical challenges for logicians seeking to provide formal models of correct reasoning in natural language. (iv) Indexicals raise fundamental questions in epistemology about our knowledge of ourselves and our location in time and space.

By far the most influential theory of the meaning and logic of indexicals is due to David Kaplan. Almost all work in the philosophy of language (and most work in linguistics) on indexicals and demonstratives since Kaplan’s seminal essay “Demonstratives” has been a development of or response to Kaplan’s theory. For this reason, the majority of this article focuses on the details of Kaplan’s theory. Before introducing Kaplan’s theory, however, it discusses the most important precursors to Kaplan, some of whose views have been revived and given new defenses in light of Kaplan’s work.

1. Some Preliminaries
2. Precursors to Kaplan’s Theory
3. Kaplan’s Semantic Theory of Indexicals
4. True Demonstratives
5. Kaplan’s Logic of Indexicals
   a. The Core Idea of Kaplan’s Logic
   b. Kaplan’s Other Semantic Theory
6. Objections to Kaplan’s Semantic Theory and Logic
7. Alternatives to Kaplan’s Theory of Indexicals
   a. John Perry’s Reflexive-Referential Theory
   b. Expression-Based Alternatives
8. References and Further Reading

1. Some Preliminaries

Indexicals are words or phrases. To talk carefully about them, we need some resources for talking carefully about words and phrases. There are more distinctions here than may be apparent at first glance. In the case of indexicals and demonstratives, some of these distinctions are crucial.

a. Expressions and Utterances

Suppose that a speaker, Greg, utters the sentence “I am hungry”. We can distinguish between the action that Greg has performed—the utterance—and the sentence or expression that Greg has uttered. If Molly also utters “I am hungry”, then Molly and Greg have uttered the same sentence, but they have performed different actions. There is also a way of talking about actions on which we can say that Molly and Greg have performed the same action—they have both uttered “I am hungry”—but this is not the way we will talk about actions here. As we will use the term, an utterance is a particular event that occurs at a particular time and place. In this sense, Greg’s utterance and Molly’s utterance are distinct events, because they occurred at different places (and perhaps at different times).

We will also generalize our use of “utterance” so that it refers to inscriptions—acts of writing sentences—as well as to acts of speaking. So if Greg and Molly each write “I love you” on a sheet of paper, we will say that they have performed different (though similar) utterances. Yet in this case as well, they have written the same sentence. This slight extension of the standard use of “utterance” is common in discussions of indexicals and demonstratives. As we will see below, written notes provide interesting test cases for certain theories of indexicals.

b. Types and Tokens

It is also important to distinguish an utterance from the particular concrete instance of a sentence, word, or phrase that is produced or used in the course of an utterance. This distinction is easiest to see in the case of writing, where an act of writing produces some concrete thing—ink or graphite marks on a page, chalk marks on a blackboard, a specific distribution of pixels on a screen, and so forth. Following Charles Sanders Peirce, philosophers call these concrete instances of words, phrases, or sentences tokens. Tokens can also take the form of particular patterns of sound, as in the case of spoken language, and here again, it is important to distinguish the act of producing a particular pattern of sound—an utterance—from the particular pattern of sound produced—a token.

In our examples involving Greg and Molly above, we said that Greg and Molly each uttered the same sentence. This means that what we are calling the sentence that Greg and Molly both uttered is not the same thing as either of the tokens that they have produced. Again following Peirce, we will say that the tokens that Greg and Molly have each produced are instances or tokens of the same sentence type. Similarly, Greg and Molly have each produced tokens of the word type “I”. While tokens are concrete things, types are abstract. While tokens are located in particular places in space and time, types are not located anywhere.

The precise status and nature of types is a difficult question. Here are just two examples of the kinds of puzzles that arise when one begins to think about types versus tokens. (i) Are types universals? They seem to be, given that they are abstract objects that are in some sense instantiated by their tokens. (ii) In virtue of what do two tokens count as tokens of the same type? In some cases, this may seem straightforward: if you are viewing this article on two different screens (or perhaps on a screen and a printed copy), you see two tokens of this sentence that are orthographically very similar to one another. But what about a token of “I am hungry” written by hand in a cursive script on a piece of paper, and another produced by Greg speaking the sentence? In virtue of what do these count as tokens of the same type? They have little in common in virtue of which we can say that they are similar. Despite these difficulties, we will continue to talk about tokens and types in the ways outlined above.

In what follows, the terms “word”, “phrase”, “sentence”, and “expression” will refer to types. Whenever we need to refer to particular tokens, we will use phrases such as “the token of the sentence ‘I am hungry’ produced by Greg”. Some philosophers are not always as careful about this usage as they could be, and anyone who wants to read further in the literature on the topic is warned to pay careful attention to how different philosophers talk about language. Some philosophers use a convention whereby putting a numeral before a sentence, as in the case of (1), allows them to use that numeral to refer to the sentence.

(1) I am hungry.

This is the convention that we follow in this article. Thus, our examples above involve different cases of speakers uttering (1) by producing different tokens of it. But other philosophers will use the numeral to refer to some hypothetical utterance of the sentence, and others to the token produced in such an utterance.

One further point is sometimes important in discussions of indexicals: an utterance of a sentence need not involve the production of a token of that sentence. For example, I might write a note on the top sheet of a Post-it pad that says “I will return at 2:30” and post it on my office door. Here I have produced a token of the sentence “I will return at 2:30”. But next week, I might use the same sheet again, by reposting it to my office door. This time, it seems that I am uttering the sentence “I will return at 2:30” by using a token that I produced earlier. Thus, when we speak of utterances, we will also mean to include cases like this in which an agent uses a previously produced token.

c. Occurrences

The distinctions above, between utterances and expression types and tokens, are common in discussions of language. There is one other category, however, that should be borne in mind when thinking about indexicals and demonstratives. To see this, consider first the kind of question that is commonly used to introduce the distinction between types and tokens:

How many words are written between the following pair of tokens of quotation marks: “a rose is a rose”?

The question here may be taken in different ways: three words have been written, but two of those words have been written twice. Thus, in the token of “a rose is a rose” above, there are two tokens of “a” and two of “rose”. So if you were to mark off the number of times that any token of a word appears between the tokens of the quotation marks above, you would count five tokens.

Now consider the question

How many words are in the sentence “a rose is a rose”?

Here there is only one correct answer. There are three words in the sentence: “a”, “rose”, and “is”. We can, however, say something else: two of these words occur twice in the sentence. This is not to say that there are two tokens of “a” and of “rose” in the sentence. That would be a mistake: the sentence is an abstract type, and tokens are concrete particulars. Instead of distinguishing between different tokens of “a” and of “rose” in the sentence, we distinguish between the different occurrences of “a” and of “rose” in the sentence. So there are three words in the sentence, but there are five occurrences of words.

Occurrences, like types, but unlike tokens, are abstract. An occurrence of a word or phrase e within a larger phrase E may be thought of as a state of affairs: the state of affairs of e being located at a particular place in the structure of E. Thus, the two occurrences of “rose” in “a rose is a rose” are distinguished from each other according to where in the structure of “a rose is a rose” the word “rose” is located.
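The counting in the preceding paragraphs can be made concrete with a short sketch. The Python below is only an illustration and not part of any theory discussed here: it assumes that words are separated by spaces, it models an occurrence as a pair of a position in the sentence and a word type, and the function names are hypothetical.

```python
def word_types(sentence: str) -> set[str]:
    """Return the distinct word types in the sentence."""
    return set(sentence.split())


def word_occurrences(sentence: str) -> list[tuple[int, str]]:
    """Return each occurrence as a (position, word type) pair."""
    return list(enumerate(sentence.split()))


sentence = "a rose is a rose"
types = word_types(sentence)
occurrences = word_occurrences(sentence)

print(len(types))        # 3 word types: "a", "rose", "is"
print(len(occurrences))  # 5 occurrences of words

# The two occurrences of "rose" are the same type at different positions.
rose_positions = [pos for pos, word in occurrences if word == "rose"]
print(rose_positions)    # [1, 4]
```

On this modelling choice, two occurrences of the same word differ only in their position within the sentence, which mirrors the idea that an occurrence is a word's being located at a particular place in a larger structure.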

Despite the importance of distinguishing between occurrences and tokens, there are systematic relations between them. It is precisely because the sentence “a rose is a rose” contains two occurrences of “rose” that any token of the sentence will contain two tokens of “rose”. This relation will be important when we turn to theories of true demonstratives.

2. Precursors to Kaplan’s Theory

In the 20th century, there were two basic approaches to the semantics of indexicals and demonstratives: utterance-based and expression-based theories. Almost all of the theories prior to David Kaplan’s influential theory were utterance-based. In early attempts to elaborate such theories, however, philosophers did not always pay due attention to the distinction above between utterances and the tokens produced (or used) in those utterances. The discussion below largely follows the original philosophers’ terminology, departing from it only to clarify where it is important to point out that they have elided the distinction between tokens and utterances.

a. Peirce on Indexical Signs

The term “indexical” is due originally to Charles Sanders Peirce, who introduced it as part of a threefold theory of signs. In this theory, Peirce distinguished between icons, indices, and symbols. All signs, on Peirce’s view, have the basic function of representing some object to some cognitive agent, but different kinds of signs accomplish this function in different ways. Icons represent an object to an agent by exhibiting or displaying to the agent the properties of the object they represent. A clear example of this is a diagram of a machine, which represents visually both the shapes of the parts and the structure of the machine.

Indices represent by standing in some kind of intimate relation to their objects. Peirce calls these relations “existential relations”, because indices cannot represent objects unless those objects exist to stand in the appropriate relations to them. Indices are a fundamental part of Peirce’s theory, but for Peirce, existential relations are easy to come by. This is because many causal relations count, for Peirce, as existential relations. As an example of an index, Peirce considers a hole in a wall: one can infer from the hole the existence of a gunshot in the room. Thus, the hole is an index of the gunshot.

As this example makes clear, indices in Peirce’s theory by themselves have little to do with language, or indeed with representation in any obvious sense. Indices in Peirce’s theory exhibit what H. P. Grice would later call natural meaning, wherein the presence of one state of affairs is a reliable indicator of the presence of another. Grice’s famous examples include that smoke means fire, and that presence of a certain rash means measles. Yet neither of these cases is plausibly an example of representation: the presence of smoke does not represent the presence of fire, nor does the presence of a particular rash represent the presence of measles.

Symbols, finally, represent their objects in virtue of conventions or rules that state that they stand in for those objects. Thus on Peirce’s view, all words of a language are symbols, because all words have their meanings conventionally. But some words are also indices. Peirce cites the demonstrative pronouns “this” and “that” as examples. On Peirce’s view, the conventional rules governing “this” and “that” dictate that a speaker can use them to refer to objects in the immediate perceptual environment. The audience of a successful use of a demonstrative can infer the existence of an object referred to—an “existential” relation. If the audience cannot infer the existence of an object referred to, then the use of a demonstrative has not been successful. Thus, demonstrative pronouns are both symbols (governed by conventional rules) and indices (representing objects in virtue of the existential relations they bear to those objects).

b. Russell on Egocentric Particulars

Bertrand Russell calls words like “I”, “here”, “now”, and so forth egocentric particulars. In Russell’s theory, all such expressions can be analyzed as descriptions involving the demonstrative pronoun “this”. So, for Russell, “now” means “the time of this” and “here” means “the place of this”. Russell offers different analyses of “I”, proposing at one time that it means “the person experiencing this”, and at another time that it means “the biography to which this belongs”. Thus on Russell’s analysis, all egocentric particulars can be reduced to one, and the status of egocentric particulars turns on the status of “this” (about which Russell held conflicting views at different times). According to Russell, this analysis of egocentric particulars captures an important feature of their use: that the reference (or denotation) of a particular utterance of an indexical is always relative to the speaker (and perhaps the time) of the utterance.

Yet Russell’s analysis fails on precisely the grounds that the interpretation of a particular utterance of “this” is not fixed merely by the identity of the speaker and the time of the utterance. This is because, as we see later, speakers can use “this” to refer to different things in their immediate environment. What a speaker refers to using “this” depends on some further feature of the context of the use: either the speaker makes some gesture, or there is enough common knowledge in the background that the speaker’s audience can identify what object the speaker intends to refer to (see section 4b below).

c. Reichenbach on Token-Reflexives

One of the most developed and influential theories of indexicals prior to Kaplan is due to Hans Reichenbach. Reichenbach’s theory is, in many ways, similar to Russell’s, but Reichenbach offers both a more sophisticated analysis of individual indexical expressions, and a more subtle treatment of the principles underlying the analysis. The key to both of these is Reichenbach’s emphasis on tokens in his analysis.

Reichenbach calls indexical expressions “token-reflexives”. The reason for this is clear on even an informal statement of Reichenbach’s view: the indexical “I” means “the person who utters this token”, “here” means “the place at which this token is uttered”, “now” means “the time at which this token is uttered”, and so forth. Token-reflexive expressions are thus expressions whose meaning is in some way keyed to individual tokens of them. (Though Reichenbach’s official theory is stated in terms of types and tokens, some passages in Reichenbach’s Elements of Symbolic Logic suggest that he was thinking of utterances rather than tokens. Contemporary defenders of Reichenbach-inspired views adopt this variation—see section 7a and García-Carpintero.)

Even on this informal statement, Reichenbach’s view clarifies to some degree the role of “this” in Russell’s analysis of egocentric particulars: a particular utterance of an indexical must refer to a token. Yet without further elaboration, this statement of Reichenbach’s view would be subject to the same problem as Russell’s, because it is undetermined which token is supposed to be referred to. If I utter “I am the person who uttered this token”, while pointing at a token of a sentence that someone else wrote on a chalkboard, then I have said something false.

This worry is allayed by a closer examination of the details of Reichenbach’s analysis. Suppose that Bertrand Russell utters (2):

(2) I am a philosopher.

In so doing, Russell has produced a token of “I”. Call this token t₁. Then on a more careful statement of Reichenbach’s view, Russell’s utterance of (2) means the same thing as (3):

(3) The person who utters t₁ is a philosopher.

Since Russell is the person who utters t₁, and Russell is a philosopher, Russell’s utterance is true. This shows that our rough translation of “I” above as “the person who utters this token” was incomplete. It is more correct (though on Reichenbach’s view, still not strictly correct; see below) to say that the meaning of “I” is such that any token t of “I” refers to the person who utters t. Thus unlike Russell, who reduced all indexicals (Russell’s egocentric particulars) to the demonstrative pronoun “this”, Reichenbach reduces all indexicals (Reichenbach’s token-reflexives) to a very special kind of token-reflexive operation.

The token-reflexive operation that forms the basis of Reichenbach’s analysis is the special technical device of “token-quotes”—the pair of arrows “^↓” and “^↓” that Reichenbach introduces in his analysis of the phrase “this token”. For Reichenbach, the result of enclosing a token in token-quotes, as in

^↓a^↓,

produces a token that refers to the token of “a” enclosed in the quotes. The emphasis on “token” in the previous sentence is important, because the token below refers to a different token of “a”:

^↓a^↓.

Call these “token-quote phrases”. The above examples show that on Reichenbach’s view, no two tokens of a token-quote phrase can refer to the same thing. As a result, we cannot talk about the meaning of a token-quote phrase, because there is no meaning that any two tokens of the phrase share. For this reason, Reichenbach calls token-quote phrases “pseudo-phrases”. Since token-quote phrases are the foundation of Reichenbach’s analysis of indexicals, all indexicals are similarly pseudo-phrases. As a result, it is strictly speaking incorrect, on Reichenbach’s view, to talk about the meaning of an indexical.

One consequence of this view is that different utterances of (2), even by the same person, will strictly speaking mean different things. Suppose that Russell utters (2) a second time. In doing so, Russell has produced a separate token of “I”. Call this token t₂. On Reichenbach’s view, Russell’s second utterance of (2) means the same thing as (4):

(4) The person who utters t₂ is a philosopher.

This consequence of Reichenbach’s view is counter to our intuitions about the use of (2): if Russell uses (2) twice, Russell has said the same thing about himself. On Reichenbach’s view, Russell said two different things about two different tokens of “I”. Yet because in both cases, it was Russell who did the uttering, the truth of what Russell said in each case turns on whether Russell is a philosopher. Thus, Reichenbach’s analysis gets the right truth conditions for an utterance of (2), but at the expense of certain intuitions about the meaning of “I”.
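The contrast between meaning and truth conditions on the token-reflexive analysis can be illustrated with a small sketch. The Python below is a toy model under stated assumptions, not Reichenbach’s own formalism: the `Token` class, the function names, and the set of background facts are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """A concrete token of a word, produced in a particular utterance."""
    name: str     # label for the token, such as "t1"
    word: str     # the word type of which this is a token
    utterer: str  # the person who utters the token


# Assumed background facts, for illustration only.
PHILOSOPHERS = {"Russell"}


def paraphrase(token: Token) -> str:
    """Token-specific paraphrase of "I am a philosopher", as in (3) and (4)."""
    return f"The person who utters {token.name} is a philosopher."


def is_true(token: Token) -> bool:
    """Truth turns on whether the utterer of the token is a philosopher."""
    return token.utterer in PHILOSOPHERS


t1 = Token("t1", "I", "Russell")
t2 = Token("t2", "I", "Russell")

# The paraphrases mention different tokens, so they differ...
print(paraphrase(t1) == paraphrase(t2))  # False
# ...but the truth values agree, because the utterer is the same.
print(is_true(t1), is_true(t2))          # True True
```

In this sketch the two utterances receive different paraphrases while agreeing in truth value, which is the pattern described above.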

Reichenbach’s view has a further odd consequence, noted by David Kaplan. Suppose that I utter (5), and let “t₃” name the token of “I” that I have produced in so doing:

(5) If no one were to utter t₃, then I would not exist.

According to Reichenbach’s analysis, my utterance of (5) means the same thing as (6):

(6) If no one were to utter t₃, then the person who utters t₃ would not exist.

But (6) is plausibly a logical truth. Thus on Reichenbach’s view, my utterance of (5) is true as a matter of logic. Yet my utterance of (5) is clearly false: had I not uttered (5), I would nonetheless have continued to exist.

d. Burks on Indexical Symbols

In the article “Icon, Index, and Symbol”, Arthur Burks develops Peirce’s suggestive remarks about indexical words into a more systematic theory of their meanings. Burks’s theory also addresses some of the odd consequences of Reichenbach’s theory noted above (though it is unclear whether Burks was familiar with Reichenbach’s view). Thus, Burks’s theory represents a culmination of several different strands of thought concerning indexicals prior to Kaplan’s work.

On Burks’s theory, all expression types of a given language have what Burks calls symbolic meaning. This is the meaning of the expression type determined by the conventions governing the language. All tokens of a given expression type share the symbolic meaning of the type. The difference between indexical expressions and non-indexical expressions is in the meanings of individual tokens. For non-indexical expressions, the meaning of an individual token just is the symbolic meaning of the type of which it is a token. For indexical expressions, in contrast, the symbolic meaning of the expression type is only part of the meaning of each individual token of that type. The full meaning of a token of an indexical expression includes information about the token itself—where and when it exists, who produced it, and so forth. Burks calls this full meaning of a token of an indexical expression the indexical meaning of the token. So different tokens of an indexical expression differ in indexical meaning, but their different indexical meanings all have the symbolic meaning of the indexical expression in common.

For Burks, the indexical meaning of a token is what someone must know about that token in order to determine what that token represents. On Burks’s view, the indexical meaning of a token of an indexical expression comprises all of the following:

(i) the spatiotemporal location of the token;

(ii) a description of the object that the token represents; and

(iii) a set of what Burks calls “directions” that relate the token to the object it represents.

The directions in (iii) can arise in two different ways, either (a) as encoded in the symbolic meaning of the type of which the token is an instance, or (b) as determined by an act of pointing, or some similar gesture on the part of the person who produces or uses the token. Elements (ii) and (iiia) of the indexical meaning of a token are supplied by the symbolic meaning of the type of which the token is an instance. These will be shared by all tokens of the same type of indexical expression. Elements (i) and (iiib) are supplied by an individual’s knowledge of the token and its production or use. These will vary from one token to another.
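The division of labor between type and token in Burks's account can be pictured as a small data structure. The following is only an illustrative sketch: the class names, field names, and sample values are hypothetical and are not drawn from Burks's text. It shows elements (ii) and (iiia) being stored once on the expression type, while elements (i) and (iiib) are stored on each token.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SymbolicMeaning:
    """Hypothetical stand-in for what every token of a type shares."""
    description: str          # element (ii): description of the represented object
    encoded_directions: str   # element (iii-a): directions fixed by convention


@dataclass(frozen=True)
class Token:
    """Hypothetical stand-in for one concrete token of an indexical type."""
    symbolic: SymbolicMeaning
    location: str                  # element (i): where the token exists
    time: str                      # element (i): when the token exists
    gesture: Optional[str] = None  # element (iii-b): pointing or similar act


def indexical_meaning(tok: Token) -> dict:
    """Assemble the full (indexical) meaning of a token from elements (i)-(iii)."""
    return {
        "location": (tok.location, tok.time),
        "description": tok.symbolic.description,
        "directions": (tok.symbolic.encoded_directions, tok.gesture),
    }


# Two tokens of one (made-up) indexical type: same symbolic meaning,
# different token-specific information.
THAT = SymbolicMeaning("the object indicated", "follow the speaker's gesture")
first = Token(THAT, "kitchen", "09:00", gesture="points at the kettle")
second = Token(THAT, "garden", "17:30", gesture="points at the gate")
```

On this sketch, the indexical meanings of `first` and `second` agree on the description and the conventionally encoded directions, and differ in location and gesture, which mirrors the claim that different tokens differ in indexical meaning while sharing the symbolic meaning of their type.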

Though Burks does not examine the question in detail, it appears that the importance of the individual elements of (i)-(iii) can vary from one indexical to another. For example, in the case of an utterance of the indexical “I”, someone may fully understand the utterance without knowing the spatiotemporal location of the utterance. (Suppose, for example, you get a phone call from a friend, but you have no idea where your friend is calling from, or that you hear a call of “Help me!” from a voice you recognize, but you cannot tell where the call is coming from.) On Burks’s view, then, it follows that one can understand an utterance of “I” without fully grasping its indexical meaning.

Burks’s suggestion that a complete semantic theory of indexical expressions may require appeal to two distinct kinds of meaning is important. As we see later, David Kaplan’s influential theory of indexicals develops a related suggestion in a systematic way.

e. Objections to Utterance-based Theories

The theories of Reichenbach and Burks (and probably Russell as well) are clear cases of what was called, in the introduction to this section, utterance-based semantic theories of indexicals. There are two influential objections to utterance-based theories. The presentation of the objections will focus on Reichenbach’s theory, because its technical details are worked out fully enough that the force of the objections is easiest to see there.

One important objection to utterance-based theories generally is due to David Kaplan. According to Kaplan, utterance-based theories do not provide adequate resources to explain the logical properties of indexicals and demonstratives. According to Kaplan, an adequate semantics for indexicals should explain the logical truth of a sentence like (7):

(7) If today is Monday, then today is Monday.

Yet given an utterance-based semantics, it is unclear how to do so. On Reichenbach’s analysis of indexicals, let u be some utterance of (7), and let t₁ and t₂ be the two tokens of “today” produced (or used) in u. According to Reichenbach, the truth conditions for u are given by (8):

(8) If the day on which t₁ is produced is Monday, then the day on which t₂ is produced is Monday.

Not only is (8) not logically true, it could even be false. Suppose that u were performed right around midnight, slowly enough that t₁ was produced at 11:59 PM on Monday, and t₂ at 12:01 AM on Tuesday. In this case, (8) is false. The same problem arises for the argument

(9) Today is Monday; therefore, today is Monday.

This looks like it should be a valid argument—it appears to have the form p; therefore p. Yet there are utterances of it on which the utterance of the premise is true, while the utterance of the conclusion is false.
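The midnight case can be checked mechanically. The sketch below is a toy model, not Reichenbach's own formalism: it assumes that a token of “today” can be represented simply by the moment at which it was produced, and the function names and the particular dates (1 January 2024 was a Monday) are chosen only for illustration.

```python
from datetime import datetime

MONDAY = 0  # datetime.weekday() numbers Monday as 0


def produced_on_monday(token_time: datetime) -> bool:
    """True if the day on which the token was produced is a Monday."""
    return token_time.weekday() == MONDAY


def truth_of_8(t1: datetime, t2: datetime) -> bool:
    """Truth conditions (8): if the day of t1 is Monday, the day of t2 is Monday."""
    return (not produced_on_monday(t1)) or produced_on_monday(t2)


# An ordinary utterance: both tokens produced at midday on a Monday.
ordinary = truth_of_8(datetime(2024, 1, 1, 12, 0), datetime(2024, 1, 1, 12, 0))

# The slow utterance around midnight: t1 late on Monday, t2 early on Tuesday.
midnight = truth_of_8(datetime(2024, 1, 1, 23, 59), datetime(2024, 1, 2, 0, 1))
```

Here `ordinary` comes out true and `midnight` comes out false, so the truth conditions in (8) can fail even though (7) has the form of a logical truth.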

A separate problem for utterance-based theories is that a semantic theory for a language should provide an interpretation of every sentence of the language. Yet on utterance-based theories such as Reichenbach’s, sentences containing indexicals receive an interpretation only upon being uttered. In the absence of an utterance of a sentence, Reichenbach’s theory offers no interpretation of it. Given the recursive structure of language, there are sentences that are too long to be uttered by any individual, and hence sentences that never receive any interpretation on Reichenbach’s theory. (For a discussion of and response to both of these objections to utterance-based theories of indexicals, see García-Carpintero.)

3. Kaplan’s Semantic Theory of Indexicals

We now turn to Kaplan’s influential theory of indexicals. Unlike the theories introduced in the previous section, Kaplan’s is an expression-based semantic theory. Kaplan does not take the objects of semantic evaluation to be utterances or tokens. Rather, Kaplan considers the expressions (types) themselves relative to contexts. On Kaplan’s theory, contexts are abstract formal structures that represent certain features of an utterance. As a result, the objects of semantic evaluation on Kaplan’s theory are abstract objects—expressions relative to contexts—rather than concrete physical objects (tokens) or particular events (utterances).

When discussing Kaplan’s theory, one must be careful: there are two different theories attributed to Kaplan on the basis of what he says in “Demonstratives”. We begin by introducing one of these theories. In section 5, when we discuss Kaplan’s logic of demonstratives, we introduce the other theory, and give reasons to prefer the first theory. It is this first theory that we refer to as “Kaplan’s (semantic) theory”.

a. Background and a Basic Insight

Kaplan’s semantic theory of indexicals is embedded in a general picture of the nature of meaning. In order to understand the significance of Kaplan’s theory, it is important to grasp this picture. According to this picture, the meaning of a sentence S—in the sense of the information encoded by S—is a complex, structured entity whose constituents are the meanings of the sub-sentential expressions (words and phrases) that occur in S, and whose structure is determined by the structure of S. This structured entity is called the proposition expressed by S. It is common to represent structured propositions using ordered n-tuples. For example, the sentence “Tally is a dog” expresses the proposition that we can represent using the ordered pair below:

〈BEING A DOG, Tally〉

(It is convenient to talk about ordered pairs, or more generally ordered n-tuples, like this one as being the proposition expressed by “Tally is a dog”, and we will follow this practice. It is important to keep in mind, however, that this is merely a convenience: strictly speaking, a structured proposition is not an n-tuple, and the n-tuple merely represents or stands for the proposition.) The constituents of this proposition are Tally and the property of being a dog. These are the meanings of the significant constituents of the sentence “Tally is a dog”: Tally is the meaning (or referent) of “Tally”, and the property of being a dog is the meaning of the predicate “is a dog”. The structure of the proposition reflects the fact that the sentence “Tally is a dog” is the result of putting the name “Tally” together with the predicate “is a dog”. The sentence “Lassie is a dog” would express a different proposition:

〈BEING A DOG, Lassie〉

It is common to refer to these propositions using the complex “that”-clauses “that Tally is a dog” and “that Lassie is a dog”, respectively. (The “that” in these clauses is not a demonstrative pronoun; it is what linguists call a “complementizer”.)

This picture of propositions as complex structured entities that contain objects and properties as constituents is due originally to Bertrand Russell, from his Principles of Mathematics, and it is currently a subject of significant controversy in the philosophy of language. Kaplan’s semantic theory of indexicals is one of the primary reasons many philosophers today embrace this Russellian picture of propositions.

Kaplan’s main contributions to the semantics of indexicals are (i) the recognition of a distinct kind of meaning, clearest in the case of indexicals like “I”, and (ii) a formal theory that explains how the different kinds of meaning are related to each other and to logic, linguistic competence, and language use. To understand Kaplan’s basic insight, consider two utterances of (10), one utterance by Barack Obama, and the other by Hillary Clinton:

(10) I am flying.

Two observations are immediate here: (i) Obama and Clinton have said or asserted different things—Obama has said of himself that he is flying, while Clinton has said of herself that she is flying—and (ii) the sentence that both Obama and Clinton have uttered means the same thing in both cases. Furthermore, these two observations are related: it is because (10) means what it does, and means the same thing when Obama utters it as it does when Clinton utters it, that Obama and Clinton can each use (10) to say different things.

The traditional notion of a proposition, as captured in the Russellian picture of propositions sketched above, applies to what is said or asserted. On this picture, Obama and Clinton have said or asserted different propositions. So the Russellian picture by itself does not offer any account of the meaning of (10) that remains constant across its different uses. This is where Kaplan’s first contribution comes in.

b. Character, Context, and Content

Kaplan calls the meaning of an expression that stays constant across different contexts of use its character. In Kaplan’s theory, character plays two fundamental roles: (i) the character of “I” is what a competent speaker of English knows in virtue of being competent with “I”; and (ii) the character of an expression is a rule or function whose arguments are contexts, and whose value for any context is what Kaplan calls the content of the expression relative to the context.

The character of “I”, for example, is a function whose value, for any context c, is what Kaplan calls the agent (c_A) of c (the speaker or writer of the context). The agent c_A of a context c is thus the content of “I” relative to c. A language user who is competent with “I” knows this rule, and it is this knowledge, together with information about a context, that allows a language user to figure out who “I” refers to relative to the context.

Generalizing from this example, we arrive at the following theory of meaning: character and content are two different kinds of meaning had by expressions of a language. In virtue of its character, each expression has a content relative to a context. Different kinds of expressions are assigned different kinds of contents relative to contexts. The content of a singular term like “I” relative to a context is an object or individual. The content of an n-place predicate relative to a context is an n-place property or relation. The content of a sentence relative to a context is a structured, Russellian proposition, whose constituents are the contents, relative to the same context, of the atomic expressions (words or phrases) occurring in the sentence.

Some expressions have a character that yields the same content relative to every context. The character of “Barack Obama”, for example, determines the same individual, Barack Obama, relative to every context. Other expressions have a character that yields different contents relative to different contexts. This is the characteristic feature of indexicals, and it is inherited by any expression that contains an indexical. Thus, we may talk not only about the indexicals “I”, “now” and “here”, but also about indexical phrases and sentences. An example of an indexical sentence is (10) (repeated).

(10) I am flying.

In virtue of the character of (10), the content of (10) relative to a context in which Barack Obama is the agent is the structured proposition

〈FLYING, Barack Obama〉,

yet relative to a context in which Hillary Clinton is the agent, the content of (10) is the structured proposition

〈FLYING, Hillary Clinton〉.

These propositions differ in what is contributed, relative to the different contexts, by the indexical “I”. The content of “I” relative to the first context is Barack Obama; the content of “I” relative to the second context is Hillary Clinton.

In addition to an agent c_A to serve as the content of “I”, each context c of Kaplan’s theory includes a time c_T to serve as the content of “now”, a location c_P to serve as the content of “here”, and a possible world c_W to serve as the content of “actually”. Thus, the sentence “I am located here”, relative to a context c, expresses the structured proposition

〈BEING LOCATED AT, 〈c_A, c_P〉〉.

In this case, BEING LOCATED AT is a two-place relation between objects or individuals and locations, and the proposition predicates this relation of the agent and location of the context c. This captures the clear intuition that a speaker who utters “I am located here” says of himself or herself that he or she is at the location of the utterance. Additional parameters may be added to contexts as needed by different indexicals (see the discussion of true demonstratives in section 4), but Kaplan’s original theory focuses on the four above. Thus for most purposes, each context c of Kaplan’s theory can be identified with the quadruple 〈c_A, c_P, c_T, c_W〉.
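The relation between character, context, and content can be sketched in code. What follows is a toy rendering under stated assumptions: contexts are modeled as the quadruple just described, characters as ordinary functions from contexts to contents, and structured propositions as tuples (which, as noted earlier, merely represent propositions). All names, including the sample places and times, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """A context identified with the quadruple <c_A, c_P, c_T, c_W>."""
    agent: str   # c_A, content of "I"
    place: str   # c_P, content of "here"
    time: str    # c_T, content of "now"
    world: str   # c_W, content of "actually"


# Characters of pure indexicals: functions from contexts to contents.
def char_I(c: Context) -> str:
    return c.agent


def char_here(c: Context) -> str:
    return c.place


def char_name(bearer: str):
    """A non-indexical name has a constant character: same content in every context."""
    return lambda c: bearer


# Characters of sentences yield structured propositions, built from the
# contents of their parts relative to the same context.
def char_I_am_flying(c: Context):
    return ("FLYING", char_I(c))


def char_I_am_located_here(c: Context):
    return ("BEING LOCATED AT", (char_I(c), char_here(c)))


c1 = Context("Barack Obama", "Chicago", "noon", "actual")
c2 = Context("Hillary Clinton", "New York", "noon", "actual")
```

With these definitions, `char_I_am_flying(c1)` and `char_I_am_flying(c2)` are different tuples even though one and the same function (the character) produced both, whereas `char_name("Barack Obama")` returns the same individual for either context.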

c. Truth Relative to a Context

Kaplan’s theory also provides the resources for defining truth (and falsehood) for sentences relative to contexts. The underlying, natural idea is that if Saul Kripke utters (11), the sentence, as Saul Kripke has used it, is true in virtue of two facts: (i) relative to the context of Kripke’s use (in which Kripke is the agent), (11) expresses the proposition that Saul Kripke is a philosopher, and (ii) Saul Kripke is a philosopher:

(11) I am a philosopher.

In other words, (11), as Saul Kripke has used it, expresses a proposition that is true at the world in which Saul Kripke has used it (in this case, the actual world).

(To say that a proposition p is true (or false) at a possible world w is just to say that p would be true (false) were w actual. For example, let w be a possible world in which Barack Obama lost to Mitt Romney in the November 2012 presidential election. The proposition that in 2014, Barack Obama is president is false at w, because if w were actual, Barack Obama would not be president.)

Kaplan’s definition of truth (falsehood) for a sentence relative to a context develops this natural idea as follows: a sentence S is true (false) relative to a context c if and only if the content of S relative to c (the proposition expressed by S relative to c) is true (false) at the world c_W of c. Thus, the sentence “I am Saul Kripke” is true relative to any context in which Saul Kripke is the agent, but false relative to any context in which Saul Kripke is not the agent.
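The definition can be illustrated with a small model. This is a sketch under simplifying assumptions, not Kaplan's formal system: a possible world is represented by a name mapped to the set of atomic propositions true at it, contexts carry only an agent and a world, and the world names and facts are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    agent: str  # c_A
    world: str  # c_W, given by name


# Toy model (an assumption of this sketch): each world is the set of
# atomic structured propositions that are true at it.
FACTS = {
    "actual": {("BEING A PHILOSOPHER", "Saul Kripke")},
    "w1": set(),  # a world in which Kripke is not a philosopher
}


def content_of_11(c: Context):
    """Content of (11) 'I am a philosopher' relative to context c."""
    return ("BEING A PHILOSOPHER", c.agent)


def true_at(proposition, world: str) -> bool:
    """A proposition is true at a world if it is among that world's facts."""
    return proposition in FACTS[world]


def true_in_context(character, c: Context) -> bool:
    """S is true relative to c iff the content of S relative to c is true at c_W."""
    return true_at(character(c), c.world)


kripke_here = Context("Saul Kripke", "actual")
kripke_w1 = Context("Saul Kripke", "w1")
tally_here = Context("Tally", "actual")
```

In this model (11) is true relative to `kripke_here` and false relative to both `kripke_w1` and `tally_here`: the context fixes which proposition is expressed, and the world of that same context fixes where the proposition is evaluated.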

There are two features of Kaplan’s definition of truth relative to a context worthy of further attention. The first is that sentences have truth values relative to contexts and worlds. This observation is more general than the definition of truth relative to a context. Given any context c and world w, we can assign a truth value to a sentence S relative to c and w: it is just the truth value at w of the proposition expressed by S relative to c. Because each context c uniquely determines a world c_W (the world of the context), there are two distinct possible world parameters relevant to assigning a truth value to a sentence S—the world c_W of the context c relative to which S expresses a proposition, and the world w at which we evaluate the proposition expressed by S relative to c. This is an example of double indexing, which was recognized before Kaplan’s work as necessary for the treatment of indexicals. (For an early and influential discussion of double indexing for the indexical “now”, see Kamp.)

Double indexing applies not only to sentences but to singular terms and predicates as well. Just as a sentence is assigned a truth value relative to a context and a possible world, so a singular term (either a proper name or a definite description) is assigned a denotation relative to a context and a possible world, and an n-place predicate is assigned an extension (a set of n-tuples) relative to a context and possible world.

The second important feature of Kaplan’s definition of truth relative to a context is that the second possible world parameter is the world of the context. Again, if we focus just on the possible world parameter of a context, this means that the world c_W of the context c is playing two roles in the definition of truth relative to a context c: in one role, it represents the world at which a sentence is uttered or used, and relative to which the sentence expresses a proposition. In the other role, it represents the (actual or counterfactual) circumstance relative to which we evaluate the proposition expressed. This was implicit already in the intuitive statement above of the underlying idea that Kaplan’s definition seeks to capture: (11), as Saul Kripke has used it in the world in which he has used it, expresses a proposition that is true at the world in which he has used it. The two occurrences in the previous sentence of the phrase “the world in which he has used it” reflect the two roles played by the world c_W of the context c in Kaplan’s formal definition of truth relative to c.

One of Kaplan’s most significant philosophical insights was to recognize the difference between these two roles. To help keep these distinct roles clear, Kaplan introduced the phrase circumstance of evaluation to refer to the second role played by the world parameter in the definition of truth for a sentence relative to a context. This allows us to restate Kaplan’s definition as follows: a sentence S is true relative to a context c if and only if the content of S relative to c (the proposition expressed by S relative to c) is true at the circumstance of evaluation c_W determined by c.

(The circumstance of evaluation in Kaplan’s formal definition includes the time c_T of the context as well, but this (i) raises questions about the metaphysics of propositions that are better addressed elsewhere, and (ii) would make the ensuing discussion more complicated without compensatory benefits.)

Again, this feature of Kaplan’s definition of truth relative to a context generalizes to singular terms and predicates. A singular term t denotes an object o relative to a context c if and only if t denotes o relative to c and the circumstance of evaluation c_W determined by c. Likewise, an n-place predicate P_n has an extension E relative to c if and only if E is the extension of P_n relative to c and the circumstance of evaluation c_W determined by c.

d. Indexicality and Modality

The importance of the distinction between context and circumstance of evaluation is particularly clear when we consider sentences containing both indexicals and modal operators like “necessarily” or “possibly”. On the standard semantic treatment of the modal operators, sentences are true or false only relative to a possible world. A sentence like (12) is true relative to, or at, a world w if and only if there is a possible world w* (accessible from w) such that (13) is true at w*:

(12) Possibly, Barack Obama is president.

(13) Barack Obama is president.

The modal operator “possibly” in (12) serves to shift the possible world parameter of evaluation: the truth of (12) at a world w depends on the truth of (13) at some other world w*. (Strictly, w* could be identical with w, but it need not be.)

When we turn to indexical sentences, however, there are two possible world parameters relative to which such sentences are true or false: the world of the context and the circumstance of evaluation. Which parameter does the modal operator shift?

One way to approach this question is to ask what we mean when we say that a sentence S is true at a possible world w. One thing we could mean by this is that if one were to utter S in w, then one would say something true. On this account, to say that S is true at all possible worlds is to say that no matter what world one was in, if one uttered S in that world, one would say something true. But this is highly implausible. If Robby the Ranger utters (14) in this world, then Robby says something true, because “Yellow-Yellow” refers to a notorious bear that lived in the Adirondacks in the early 2000s:

(14) If Yellow-Yellow exists, then Yellow-Yellow is a bear.

But in another possible world, the name “Yellow-Yellow” might refer to a raccoon. So were Robby to utter (14) in this other possible world, what Robby said would be false. Thus on this view, (14) is not true at every possible world, and hence (15) is false:

(15) Necessarily, if Yellow-Yellow exists, then Yellow-Yellow is a bear.

But most philosophers, persuaded by Kripke, would reject this conclusion: if Yellow-Yellow was a bear, then she was essentially a bear. (See Kripke for a defense of the existence of essential properties.)

There is an alternative interpretation of what we mean when we say that a sentence S is true at a possible world w. On this interpretation, we consider what S says in the actual world (or what someone who uttered S would strictly and literally say), and we evaluate what S says for truth or falsehood at w. More carefully: a sentence S is true at world w if and only if the proposition actually expressed by S is true at w. On this interpretation, then, evaluating “Necessarily, S” requires first determining the proposition actually expressed by S, and then evaluating this proposition at every possible world. This yields the intuitively correct result for the sentence “Necessarily, if Yellow-Yellow exists, then Yellow-Yellow is a bear”. This is true if and only if the proposition actually expressed by “If Yellow-Yellow exists, then Yellow-Yellow is a bear” is true at every possible world. But this proposition is vacuously true at worlds where Yellow-Yellow does not exist, and if Kripke is correct that Yellow-Yellow is essentially a bear, then this proposition is also true at every world where Yellow-Yellow does exist.

As we saw in our discussion of Kaplan’s definition of truth relative to a context, the role of the circumstance of evaluation is to be the world relative to which the proposition expressed by S relative to a context is evaluated. Thus, the intuitive reflections on what we mean when we say that a sentence S is true at a world suggest a clear answer to the question from two paragraphs back: modal operators like “necessarily” and “possibly” shift the circumstance of evaluation, not the world of the context.

This answer garners further support from our intuitions about sentences containing “actually”. Because “actually” is an indexical, its interpretation relative to a context c is determined by the parameters of the context. In the case of “actually”, the relevant parameter is the world c_W of the context. Thus, if modal operators shifted the world of the context, then they would shift the interpretation of the modal indexical “actually”, but intuitively they do not. Kaplan’s famous example of this is (16):

(16) It is possible that in Pakistan, in five years, only those who are actually here now are envied.

In this sentence, “actually” is within the scope of “it is possible that”. So if “it is possible that” shifts the world of the context, then the value of “actually” would be shifted. But it is not. Suppose Kaplan utters both (16) and (17):

(17) Only those who are actually here now are envied.

It is clear that in both cases, Kaplan’s use of “actually” picks out the same world—the world in which he performs both utterances. The only alternative is that the modal operators “possibly” and “necessarily” shift the circumstance of evaluation. More precisely, for any context c and possible world w,

[Necessarily ϕ] is true relative to a context c and world w if and only if, for every possible world w* (accessible from w), ϕ is true relative to c and w*.

One of Kaplan’s central theses about indexicals in English is that there can be no operator that shifts contexts or parameters of contexts in the way that an operator like “necessarily” shifts the circumstance of evaluation. Kaplan calls such operators monsters. The claim that natural language does not include monsters is a matter of debate in current philosophy and linguistics. (For a sophisticated discussion of monsters in linguistics, see Schlenker.)

There is one final observation worth noting before we leave this section: it is important to recognize that “actually” is both an indexical, receiving a value from the context, and a modal operator. As a modal operator, “actually” serves to shift the circumstance of evaluation in the definition of truth relative to a context. But unlike “necessarily” or “possibly”, “actually” always shifts the circumstance of evaluation to the world of the context. Thus on Kaplan’s theory, for any context c and any possible world w, the rule for “actually” is as follows:

[Actually ϕ] is true relative to c and w if and only if ϕ is true relative to c and c_W.

One consequence of this rule is that for any context c and sentence S, if S is true relative to c, then so are both “Actually S” and “Necessarily actually S”. (For more discussion of this consequence, see section 5.)
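The rules for “necessarily”, “possibly”, and “actually” can be combined into a small double-indexed evaluator. The sketch below makes several assumptions that go beyond the text: formulas are encoded as nested tuples, every world is accessible from every other, the model contains just two invented worlds, and contexts carry only an agent and a world.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    agent: str  # c_A
    world: str  # c_W


# Toy model: the extension of each predicate at each world.
WORLDS = {
    "w0": {"president": {"Obama"}},
    "w1": {"president": {"Romney"}},
}


def denote(term: str, c: Context) -> str:
    """'I' gets its content from the context; any other term names itself."""
    return c.agent if term == "I" else term


def holds(phi, c: Context, w: str) -> bool:
    """Truth of phi relative to context c and circumstance of evaluation w."""
    op = phi[0]
    if op == "atom":
        _, predicate, term = phi
        return denote(term, c) in WORLDS[w][predicate]
    if op == "nec":   # shifts the circumstance; the context stays fixed
        return all(holds(phi[1], c, v) for v in WORLDS)
    if op == "poss":  # likewise shifts only the circumstance
        return any(holds(phi[1], c, v) for v in WORLDS)
    if op == "act":   # resets the circumstance to the world of the context
        return holds(phi[1], c, c.world)
    raise ValueError(f"unknown operator: {op}")


def true_in_context(phi, c: Context) -> bool:
    """Truth relative to a context: evaluate at the context's own world."""
    return holds(phi, c, c.world)


S = ("atom", "president", "I")
c = Context("Obama", "w0")
```

Relative to `c`, the sentence `S` is true but `("nec", S)` is not, since `S` fails at `w1`. By contrast, both `("act", S)` and `("nec", ("act", S))` are true relative to `c`, because the clause for `"act"` ignores the shifted world and returns to `c.world`. Note that no clause of `holds` ever changes `c`, which is one way of picturing the absence of monsters from this fragment.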

e. Some Consequences of Kaplan’s Theory of Indexicals

There are several consequences of Kaplan’s theory, as laid out thus far, worth noting:

Indexical singular terms like “I” are directly referential.

A singular term is directly referential if and only if its semantic content relative to a context—what it contributes to the propositions expressed in that context by the sentences in which it occurs—is just the object or individual to which it refers. Thus, it is an immediate consequence of Kaplan’s theory that “I” is directly referential, since the semantic content of an indexical singular term like “I” relative to a context is just the agent of the context. Relative to a context, “I” directly refers to the agent of the context.

The thesis that there are directly referential singular terms is in stark contrast to the Fregean view of language, according to which the content of an expression is always a sense—a mode of presentation of an object, property, or proposition.

Indexical singular terms are rigid designators.

The concept of a rigid designator was introduced into philosophical and semantic discussions by Saul Kripke. According to Kripke, an expression e rigidly designates an object o if and only if e designates o in every possible world in which o exists, and does not designate anything else in any world in which o does not exist. To apply the concept of rigid designation to indexical singular terms, however, we need a definition of rigid designation relative to a context. The following definition is somewhat technical, but it correctly captures Kripke’s notion within a semantics for indexical expressions:

Rigid Designation Relative to a Context:

An expression e rigidly designates an object o relative to a context c if and only if for every possible world w, any predicate F, and any object x distinct from o, if o exists at w, then the proposition expressed relative to c by [e is F] is true at w if and only if, in w, o has the property expressed by F relative to c, and if o does not exist at w it is not the case that the proposition expressed by [e is F] is true at w if and only if, in w, x has the property expressed by F relative to c.

The indexical singular term “I”, for example, is a rigid designator relative to any context c according to this definition. Relative to c, the sentence [I am F] expresses the proposition

〈F-hood, c_A〉.

This proposition is true at an arbitrary world w if and only if c_A has the property F-hood in w. Thus relative to c, “I” rigidly designates c_A.

Note that in the above example, we do not have to specify whether c_A exists at w. This shows that directly referential terms are rigid designators in a particularly strong sense. A directly referential term designates the same object in all possible worlds, whether the object exists at that world or not. (Nathan Salmon, in Reference and Essence, calls such terms obstinately rigid designators.) This is because a directly referential expression contributes the object that it designates to the propositions expressed by sentences in which it occurs; the object is a constituent of the proposition. Any such proposition—one that contains an object or individual as a constituent—is called a singular proposition. Speaking loosely, when we evaluate a singular proposition for truth or falsehood at a possible world w, the singular proposition “brings along” with it the objects that are its constituents. Thus, directly referential terms automatically rigidly designate the objects or individuals to which they refer.

For any definite description [the x: Fx] that uniquely designates an object o relative to a context c, the definite description [the x: Actually Fx] rigidly designates o relative to c.

This consequence of Kaplan’s theory is a corollary of the observations about “actually” at the end of the previous section. Relative to any context c and possible world w, [the x: actually Fx] designates the unique object o that “is F” in the world c_W of c, if o exists in w. This is because of the effect of “actually”, which shifts the circumstance of evaluation to the world of the context. Thus, if [the x: Fx] designates o relative to c, [the x: actually Fx] designates o, relative to c, in every world w in which o exists, and does not designate anything else in any world w in which o does not exist.

This consequence of Kaplan’s theory is significant for one of the classic debates in the philosophy of language: the debate over the meaning of proper names. Ever since Saul Kripke’s Naming and Necessity, philosophers and linguists have recognized that proper names, such as “David Kaplan”, in natural languages such as English are rigid designators. Kripke and others take this semantic feature of proper names to be a major objection to the analysis, inspired by Frege and Russell, of proper names as definite descriptions (in Fregean terms, a definite description gives the sense of a proper name). Suppose we analyze the name “David Kaplan” as the definite description “the author of the most important work on indexicals and demonstratives in the 20th century”. Then in some possible world in which Wittgenstein wrote the most important work on indexicals and demonstratives in the 20th century, the name “David Kaplan” would designate Wittgenstein. Thus on this proposal, the name “David Kaplan” is not a rigid designator.

Some philosophers, however, have responded by modifying the Frege-Russell view: if proper names are analyzed as definite descriptions that have been rigidified by adding “actually”, then Kripke’s observation that proper names are rigid designators is just what we would expect. Other philosophers in turn have rejected this modification on various grounds. (For discussion, see chapter 2 of Soames, Beyond Rigidity.)

4. True Demonstratives

So far, we have discussed Kaplan’s semantic theory of pure indexicals—those expressions whose content is uniquely determined relative to a context by basic features of the context (like the agent, time, location, and world). As we noted in the introduction, however, there are also context-sensitive expressions for which these basic features of context are not sufficient to uniquely determine a content relative to a context. These are the true demonstratives. The paradigm examples are the singular demonstrative pronouns “this” and “that”. Except toward the end of this section, I will focus exclusively on “that”.

a. Two Challenges Posed by True Demonstratives

There are several challenges in spelling out a formal theory of true demonstratives. Two of the most important are (i) how to account, in the theory, for the role of whatever is required in a context (gestures, intentions, and so forth) to fix the reference of a particular use of a demonstrative, and (ii) how to account for the fact that distinct occurrences of the same true demonstrative can differ in content relative to the same context.

These challenges are related: on an intuitive level, it is because true demonstratives require some further supplementation from the context that distinct occurrences of the same demonstrative can refer to different things. If I point first at the Washington Monument, and then at the Capitol Building while I utter (18), I have said that the Washington Monument is taller than the Capitol Building, and I have done so because there is something in the context that fixes the reference of my first use of “that” as the Washington Monument, and something in the context that fixes the reference of my second use of “that” as the Capitol Building:

(18) That is taller than that.

These observations about true demonstratives pose a problem for Kaplan’s theory as we have stated it thus far: if the meaning of a demonstrative is its character, and the character of an expression is a function that returns the same content whenever applied to the same context, then there is no way for distinct occurrences of a true demonstrative to differ in content relative to the same context. Any attempt to accommodate true demonstratives into Kaplan’s theory must address this problem.

b. Reference Fixing for True Demonstratives

In order to address the first of the two challenges above posed by true demonstratives (that of how to incorporate into the formal theory whatever is required to fix the reference of a particular use of a demonstrative), we must first determine what in fact fixes the reference of a use of a demonstrative. There are many different theories, but most fall into one of two categories: the reference of a particular use of a demonstrative is fixed (i) by an associated gesture, or (ii) by an associated intention.

In “Demonstratives,” Kaplan defends a theory of the first kind. For Kaplan, a demonstration is the way that an object that has been singled out in some way (often, but not always, by an act of pointing) appears or is represented from a particular perspective. Kaplan calls this theory the Fregean Theory of Demonstrations. On the Fregean theory, demonstrations have three qualities in virtue of which they closely resemble (pure) indexical definite descriptions: (i) a demonstration determines a mode of presentation of an object (so that different demonstrations may be demonstrations of the same object), (ii) a particular demonstration d might have picked out a different object from the object that it in fact picks out, and (iii) a particular demonstration d might pick out no object at all (in the case of an illusion or hallucination, for example). The Fregean Theory of Demonstrations provides a natural account of the example above, in which I point at the Washington Monument and at the Capitol Building. In the example, the Washington Monument is singled out visually by my first pointing gesture as the object that I am referring to with my first use of “that”, and the Capitol is singled out visually by my second pointing gesture as the object that I am referring to with my second use of “that”.

One virtue of the Fregean Theory of Demonstrations is that it provides an account of why certain uses of demonstratives are informative, while others are not. This is illustrated by a famous example due to John Perry (in his influential article “Frege on Demonstratives”): suppose that we can see both the bow and stern of the aircraft carrier USS Enterprise in harbor, but the middle of the ship is hidden behind a tall building. Now suppose that I point first at the bow, and then at the stern, while uttering (19):

(19) That is identical to that.

My utterance is informative. But suppose instead I had pointed twice at the bow while uttering (19). My utterance in this case would not be informative. According to the Fregean Theory of Demonstrations, the demonstrations in my second utterance present the USS Enterprise in the same way, yet my demonstrations in my first utterance present the USS Enterprise in two different ways. It may be informative to be told that the object presented in one way is identical to the object presented in another way, but it is not informative to be told that the object presented in one way is identical to the object presented that same way. (Observations like this provide one way that Kaplan can respond to the criticisms discussed below in section 6a.)

One problem with gesture-based views generally is that there are uses of demonstratives that are not associated with any gestures at all. Upon seeing a bright flash through the window, I might ask my wife, “what was that?” without needing to perform any gesture at all. If I perform no gesture, then on any theory according to which the reference of my use of “that” is fixed by my gesture, my use of “that” in this example will not refer to anything. This is the wrong result: my use of “that” clearly refers to the bright flash.

This problem with gesture-based views suggests that an intention-based view is superior. But it is important in proposing or defending an intention-based view that one specifies precisely which intention one thinks is significant for fixing the reference of a use of a demonstrative. A speaker who uses a demonstrative may have several intentions: to point at a particular object o, to refer to o, to refer to the object at which he or she is pointing, and so forth. There may be cases in which these intentions do not single out the same object. For example, I may intend both (i) to refer to an object o, and (ii) to refer to the object at which I am pointing. But if I am in fact pointing at some object o* distinct from o, then these two intentions will determine distinct objects.

Philosophers who argue about different theories of reference-fixing for demonstratives often use such cases as data: suppose theory A says that the reference of a use of a demonstrative is fixed by the speaker’s intention α, and theory B says that the reference of a use of a demonstrative is fixed by the speaker’s intention β. Suppose further that there is some case in which a speaker uses “that,” and in which the speaker’s intention α uniquely determines an object o₁, and the speaker’s intention β uniquely determines an object o₂. Finally, suppose that it is clear in the case in question that the speaker has succeeded in referring with her use of “that” to o₂. This is evidence in favor of theory B over theory A.

In his later essay “Afterthoughts,” Kaplan rejects the Fregean Theory of Demonstrations in favor of a view according to which the reference of a use of a demonstrative is fixed not by a pointing gesture, but by the intention that directs the pointing gesture. Kaplan calls these directing intentions. Thus while on the later Kaplan’s view, the reference of a use of a demonstrative is fixed by an intention, that intention is still associated in some way with a speaker’s gestures: if one chooses not to perform a gesture, then one has no intention to direct a gesture at any individual. As a result, it is unclear whether this view successfully avoids one of the central problems with gesture-based views.

Other intention-based accounts may avoid this problem. According to Kent Bach, for example, the reference of a speaker’s use of “that” is the object determined by the speaker’s referential intention. On Bach’s view, a referential intention has a special reflexive structure: a speaker intends the audience to identify, and to take themselves to be intended to identify, some object or individual as the object the speaker is referring to by thinking of that object in a particular way. If the speaker performs some kind of pointing gesture, then the speaker may intend for the audience to think of the object in question as the object that the speaker is pointing at. In other cases, however, the speaker may intend for the audience to think of the object in question in other ways. (Two classic papers in the debate over demonstrative reference fixing are Marga Reimer’s “Do Demonstrations have Semantic Significance?” and Kent Bach’s “Intentions and Demonstrations”.)

c. Adding True Demonstratives to Kaplan’s Theory

In “Demonstratives,” Kaplan considers two ways of adding demonstratives to his theory. The first requires adding an artificial word, “dthat”, to the language, via the following rule: if α is a singular term or definite description, then ⌈dthat[α]⌉ is a singular term. Examples built from English expressions include “dthat[the current president of the United States]”, “dthat[Saul Kripke]”, and “dthat[the ice cream cone I ate today]”.

The semantics for “dthat” is such that relative to a context c, the content of ⌈dthat[α]⌉ is the object denoted by α relative to c. For example, relative to a context c such that c_W is the actual world and c_T is noon on January 31, 2013, the content of “dthat[the current president of the United States]” is Barack Obama, because at noon on January 31, 2013 in the actual world, Barack Obama was president. Thus, “dthat”-terms are directly referential, and hence also rigid designators.
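
The way “dthat” turns a description into a directly referential term can be sketched in a few lines. This is a minimal illustration, not Kaplan's formal system: contexts are reduced to a world and a time, the table of presidents is a stand-in, and the row for the world "w_alt" is an invented counterfactual.

```python
# (world, time) -> who is president there. The "@" row follows the example in
# the text; the "w_alt" row is an invented counterfactual for illustration.
PRESIDENT_AT = {
    ("@", "2013-01-31 12:00"): "Barack Obama",
    ("w_alt", "2013-01-31 12:00"): "someone else",
}


def the_current_president(c, w):
    """Ordinary description: the time comes from the context c,
    the world from the circumstance of evaluation w."""
    return PRESIDENT_AT[(w, c["time"])]


def dthat(description):
    """Character of dthat[description]: maps a context to a content.
    The content is an object, fixed by evaluating the description
    at the world of the context."""
    def character(c):
        return description(c, c["world"])
    return character


c = {"world": "@", "time": "2013-01-31 12:00"}
content = dthat(the_current_president)(c)

print(content)                              # the object itself
print(the_current_president(c, "w_alt"))    # the plain description varies with w
```

Because the content of the “dthat”-term is just the object, there is no further world argument for it to vary with, which is the sense in which the term is rigid.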

“Dthat”-terms exploit the similarity noted above, in our discussion of the Fregean Theory of Demonstrations, between demonstrations and indexical definite descriptions. In this way, this treatment of demonstratives addresses the problem of multiple occurrences of demonstratives by avoiding it altogether. This is because, for Kaplan, an occurrence of a “dthat”-term corresponds to a use of a demonstrative together with a particular type of demonstration, where the singular term α in ⌈dthat[α]⌉ is playing the role of the demonstration. On this treatment of demonstratives, for example, my utterance of (18) (“that is taller than that”), pointing first at the Washington Monument, and then at the Capitol Building, would be represented by (20):

(20) Dthat[the object that appears thus-and-so from here] is taller than dthat[the object that appears so-and-thus from here].

In this case, the definite descriptions “the object that appears thus-and-so from here” and “the object that appears so-and-thus from here” represent my two demonstrations. Thus rather than having two occurrences of the same word or phrase, we have two different phrases altogether.

As a tool for investigating the semantics and logic of directly referential expressions, Kaplan’s “dthat” has been very influential. But as a basis for a semantic theory of the English demonstrative pronoun “that”, “dthat” is inadequate. The primary problem with using “dthat” as a model for the English demonstrative pronoun “that” is that we do not judge ourselves to have used two different phrases when we utter “that is taller than that” while pointing at two distinct objects. Yet if the English demonstrative pronoun “that” functioned like Kaplan’s “dthat”, we would have to say that each use of “that” in an utterance of “that is taller than that” is in fact an utterance of a distinct phrase that in some way combines the word “that” with either the pointing gestures performed or some particular intention. This runs counter to our clear intuition that we are using the same word twice to refer to different things. (See Salmon, “Demonstrating and Necessity”.)

The second way that Kaplan considers adding true demonstratives to his theory requires adding an infinite (or sufficiently large) number of distinct subscripted “that”s: “that₁”, “that₂”, and so forth. Each of these is treated as a distinct word in the language. Then we add to each context c an infinite (or sufficiently long) sequence c_D of objects and individuals. Each subscripted “that” is then assigned its own character: for each i, the content of “that_i” relative to a context c is the i-th member of the sequence c_D. We will call the members of c_D the demonstrata of c. For example, let c be the following context:

〈Saul Kripke, Washington DC, August 4th, 2014, @ (the actual world), 〈the Washington Monument, the Capitol Building,…〉〉

In other words, Saul Kripke is the agent c_A, Washington DC is the location c_P, August 4th, 2014 is the time c_T, the actual world is the world c_W, and the sequence

〈the Washington Monument, the Capitol Building,…〉

is the sequence c_D of demonstrata of c. Relative to c, the sentence “that₁ is taller than that₂” expresses the structured proposition

〈TALLER-THAN, 〈the Washington Monument, the Capitol Building〉〉.
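
The subscripted treatment can be sketched as follows. This is an illustrative simplification: the field names are our own, objects are represented by strings, and the sequence of demonstrata is cut off after the two members given in the example.

```python
from typing import NamedTuple, Tuple


class Context(NamedTuple):
    agent: str                      # c_A
    place: str                      # c_P
    time: str                       # c_T
    world: str                      # c_W
    demonstrata: Tuple[str, ...]    # c_D, truncated to a finite tuple here


def that(i):
    """Character of the subscripted demonstrative that_i:
    a function from contexts to the i-th member of c_D (counting from 1)."""
    def character(c):
        return c.demonstrata[i - 1]
    return character


c = Context(
    "Saul Kripke", "Washington DC", "August 4th, 2014", "@",
    ("the Washington Monument", "the Capitol Building"),
)

# "that_1 is taller than that_2" relative to c
proposition = ("TALLER-THAN", (that(1)(c), that(2)(c)))
print(proposition)
```

Each subscripted demonstrative has its own character, but every one of those characters is an ordinary function from contexts to contents.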

This second treatment of demonstratives also avoids the problem of multiple occurrences, because in place of two occurrences of one demonstrative “that”, this theory has occurrences of two distinct terms: “that₁” and “that₂”. As a result, this theory is subject to an objection similar to that raised above to the treatment of demonstratives using “dthat”: it flies in the face of basic intuitions about the language. One apparently basic feature of English is that it contains a demonstrative pronoun “that” which can be used multiple times to refer to distinct objects. This is inconsistent with the claim that instead of a single demonstrative pronoun “that” there are infinitely many distinct subscripted pronouns “that₁”, “that₂”, and so forth.

d. David Braun’s Context-Shifting Semantics for True Demonstratives

(This section is more technical than the preceding.) An influential alternative to Kaplan’s two approaches to demonstratives is David Braun’s context-shifting theory of demonstratives. According to this theory, formal contexts include sequences of demonstrata (as on the second of Kaplan’s theories considered above), but formal contexts also include what Braun calls a focal demonstratum. The focal demonstratum of a context is simply one member of the sequence of demonstrata. An example of a formal context on Braun’s view would be

〈Saul Kripke, Washington DC, August 4th, 2014, @, the Washington Monument, 〈the Washington Monument, the Capitol Building,…〉〉.

This formal context differs from the example above only in that the Washington Monument occurs twice: once as a member of the sequence of demonstrata, and once as the focal demonstratum.

Braun then proposes that the meaning of a demonstrative “that” has two parts. One of these parts is its character. For Braun, the character of “that” is a function that for any context c returns the focal demonstratum of c. The second part of the meaning of “that” is a function that shifts the context in a systematic way: on Braun’s view, the result of applying this function to a context c whose focal demonstratum is the i-th member of the sequence of demonstrata is a context whose focal demonstratum is the (i+1)-th member of the sequence of demonstrata.

So on Braun’s view, the demonstrative “that” is associated with two functions, each of which applies to the formal contexts of the semantic theory, but which yield very different outputs. The character of “that”, which we can abbreviate as “ch_that”, is a function that when applied to a context returns a particular parameter of that context—the focal demonstratum. The shifting function of “that”, which we can abbreviate as “sh_that”, is a function that when applied to a context returns another context. Evaluating an occurrence of “that” relative to a context c thus involves two steps: in the first step, we apply the character ch_that of “that” to c, to yield the content of the occurrence; in the second step, we apply the shifting function to c, to yield a new context sh_that(c). The next occurrence of “that” (if there is one) is then evaluated relative to the new context sh_that(c).

Thus on Braun’s view, the proposition expressed by a sentence like (18) (reproduced below) relative to a context c is the proposition that follows it:

(18) That is taller than that.

〈TALLER-THAN, 〈ch_that(c), ch_that(sh_that(c))〉〉

On this proposal, the content of the first occurrence of “that” in (18) relative to c is just ch_that(c)—the result of applying the character of “that” to c. The evaluation of the first occurrence of “that” in (18) then triggers the application of the shifting function. Thus the content of the second occurrence of “that” in (18) relative to c is ch_that(sh_that(c))—the result of applying the character of “that” to the context that results from applying the shifting function of “that” to c. The difference between a context c and sh_that(c) is just a difference in the focal demonstratum. But the character of “that” (ch_that) is a function that maps a context c to the focal demonstratum of c. Thus on Braun’s view, ch_that(c) is the focal demonstratum of c, and ch_that(sh_that(c)) is the focal demonstratum of sh_that(c). Thus on Braun’s view, the content of (18) relative to c is the proposition that predicates the relation TALLER-THAN of the focal demonstratum of c and the focal demonstratum of the result of applying the shifting function to c (in that order).

An example will help to clarify the significance of Braun’s view. Suppose that c is the context

〈Saul Kripke, Washington DC, August 4th, 2014, @, the Washington Monument, 〈the Washington Monument, the Capitol Building,…〉〉,

where the Washington Monument is the focal demonstratum. Then sh_that(c) is the context

〈Saul Kripke, Washington DC, August 4th, 2014, @, the Capitol Building, 〈the Washington Monument, the Capitol Building,…〉〉,

where the Capitol Building is the focal demonstratum. Now the proposition expressed by (18) relative to c is

〈TALLER-THAN, 〈the Washington Monument, the Capitol Building〉〉.

But (i) this is just the result that we want, and (ii) we have achieved this result without abandoning the idea that the meaning of an indexical or demonstrative is its character—a function from contexts to contents.
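
The two-step evaluation procedure can be sketched in code. The sketch rests on assumptions of our own: the focal demonstratum is recorded by its position in the sequence rather than as a separate object (so that the shift to the next member is well defined), the sequence is truncated to two members, and the helper contents_of_demonstratives is a hypothetical name, not Braun's.

```python
from typing import NamedTuple, Tuple


class Context(NamedTuple):
    agent: str
    place: str
    time: str
    world: str
    focal_index: int                # position of the focal demonstratum (from 0)
    demonstrata: Tuple[str, ...]    # sequence of demonstrata, truncated here


def ch_that(c):
    """Character of 'that': context -> focal demonstratum of that context."""
    return c.demonstrata[c.focal_index]


def sh_that(c):
    """Shifting function of 'that': context -> context whose focal
    demonstratum is the next member of the sequence."""
    return c._replace(focal_index=c.focal_index + 1)


def contents_of_demonstratives(n, c):
    """Contents of n successive occurrences of 'that', starting from c:
    apply the character, then shift the context, and repeat."""
    contents = []
    for _ in range(n):
        contents.append(ch_that(c))
        c = sh_that(c)
    return tuple(contents)


c = Context(
    "Saul Kripke", "Washington DC", "August 4th, 2014", "@", 0,
    ("the Washington Monument", "the Capitol Building"),
)

# (18) "That is taller than that." relative to c
proposition = ("TALLER-THAN", contents_of_demonstratives(2, c))
print(proposition)
```

Only one character function is defined, and it is applied twice; the difference in content comes entirely from the shift between the two applications.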

5. Kaplan’s Logic of Indexicals

In addition to the semantic theory for indexicals and demonstratives discussed above, Kaplan provides an account of the logical properties of indexicals. Kaplan’s logic has been just as influential as his semantic theory. Section 5a sketches the core idea of Kaplan’s logic in an informal way, and discusses two examples of logical truths in Kaplan’s system that have been the focus of some philosophical debate. Section 5b introduces the second semantic theory of indexicals attributed to Kaplan (see the introduction to section 3), and briefly discusses reasons most philosophers prefer the semantic theory introduced in section 3.

a. The Core Idea of Kaplan’s Logic

The core of Kaplan’s logic is the idea that a sentence containing indexicals is logically true if and only if the rules governing the meanings of its indexicals, plus the rules for the logical connectives, ensure that the sentence is true in every possible context, independently of the meanings of the non-logical expressions that occur in the sentence. A simple example is (21).

(21) If I am fond of dogs, then I am fond of dogs.

Since, in every context, the character of “I” will return the same individual in both places in the sentence where it occurs, the antecedent and consequent of (21) will have the same truth value in every context, and thus (21) will be true in every context.

A more interesting example is the sentence

(22) I am president if and only if, actually, I am president.

To see that this sentence is true in every context, let us try to construct a context relative to which it is false. In virtue of the semantics for “if and only if”, this would require some context c such that (23) is false relative to c, while (24) is true relative to c (or vice versa):

(23) I am president.

(24) Actually, I am president.

But reflection on the semantics for “Actually” shows that this cannot occur. If (24) is true relative to c, then (by the definition of truth relative to a context), the content of (24) relative to c is true at the circumstance c_W of c. Given the semantics for “Actually” (see section 3d), this is the case if and only if the content of (23) relative to c is true at the circumstance c_W of c. Since “Actually” shifts the circumstance of evaluation to the world of the context, it has no effect if the circumstance of evaluation already is the world of the context. But to say that the content of (23) relative to c is true at the circumstance c_W of c is just to say that (23) is true relative to c. Thus, (24) is true relative to c if and only if (23) is true relative to c, no matter what context we take c to be.

One reason for interest in this example is that while it is a logical truth, the sentence that results from prefacing it with the modal operator “Necessarily” is not a logical truth:

(25) Necessarily (I am president if and only if, actually, I am president).

Relative to a context c, (25) is true if and only if for every world w, (22) is true relative to c and w (because “Necessarily” shifts the circumstance of evaluation). But (22) is true relative to c and w if and only if (23) and (24) are either both true or both false relative to c and w. Yet (24) is true relative to c and w if and only if (23) is true relative to c and c_W (because “Actually” shifts the circumstance of evaluation back to the world of the context). Thus the logical truth of (25) turns on the following claim about (23): that its content relative to any context c has the same truth value at every circumstance of evaluation. This claim is false. Let c be a context such that c_A is Barack Obama and c_W is the actual world, and let w be a world in which Barack Obama never ran for president. Then (23) is true relative to c (the content of (23) relative to c is true at the circumstance of evaluation c_W), but (23) is not true relative to c and w (because the content of (23) relative to c is not true at the circumstance of evaluation w). So (25) is false relative to c, and hence is not a logical truth.

This result has two related interesting consequences: (i) the rule of necessitation fails in Kaplan’s logic. Necessitation is a rule of inference stating that if ϕ is a theorem of a logical system, then so is “Necessarily ϕ”. Necessitation is a standard rule of inference in modal logic, so its failure in Kaplan’s logic of indexicals is surprising. (ii) There are logical truths in Kaplan’s logic that are not necessarily true. In other words, some logical truths in Kaplan’s logic are contingent.
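
The failure of necessitation can be checked mechanically in a small model. The sketch below makes several simplifying assumptions: contexts are reduced to (agent, world) pairs, there are only two worlds and two agents, and "true in every context of this one model" is used as a stand-in for logical truth, which strictly speaking quantifies over all models.

```python
WORLDS = ["@", "w"]
AGENTS = ["Barack Obama", "Saul Kripke"]
# Who is president in each world; "w" is a world in which Obama never ran.
PRESIDENT = {"@": {"Barack Obama"}, "w": set()}

# A context c is a pair (agent, world). A sentence is modelled as a function
# from a context c and a circumstance of evaluation w to a truth value.


def i_am_president(c, w):
    agent, _ = c
    return agent in PRESIDENT[w]


def actually(phi):
    """Shift the circumstance of evaluation to the world of the context."""
    return lambda c, w: phi(c, c[1])


def necessarily(phi):
    """Quantify over every circumstance of evaluation."""
    return lambda c, w: all(phi(c, v) for v in WORLDS)


def iff(phi, psi):
    return lambda c, w: phi(c, w) == psi(c, w)


def true_in_context(phi, c):
    """Truth relative to c: truth at the world of the context."""
    return phi(c, c[1])


def true_in_every_context(phi):
    return all(true_in_context(phi, (a, w)) for a in AGENTS for w in WORLDS)


s22 = iff(i_am_president, actually(i_am_president))
s25 = necessarily(s22)

print(true_in_every_context(s22))
print(true_in_every_context(s25))
print(true_in_context(s25, ("Barack Obama", "@")))
```

Sentence (22) comes out true in every context of the model, while (25) fails in the context whose agent is Barack Obama and whose world is the actual world, matching the counterexample described in the text.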

The significance of the second of these consequences is a matter of debate. Kaplan suggests that examples like (22) are cases of contingent a priori claims: propositions that are knowable a priori but are merely contingent. Yet to argue directly from Kaplan’s example to this conclusion requires the further assumption that logical truths in Kaplan’s logic of indexicals express propositions that are knowable a priori. This is a very important topic in the contemporary philosophy of language. For more discussion, see Soames’ Reference and Description, especially Ch. 4.

Another controversial example of a logical truth in Kaplan’s logic is (26):

(26) I am here now.

Kaplan argues that a logic of indexicals should do justice to the intuition that (26) is, in his words, “universally true”. This is another example of a sentence that is not necessarily true, even if it is true. Wherever Saul Kripke is located, if he utters (26), he says something true, but if Kripke were to utter “It is a necessary truth that I am here now”, he would say something false. He could have been somewhere else.

Yet the status of (26) as a logical truth has proven controversial. The most common objection arises from considering various technologies that we use in communication. Early objections to Kaplan’s claim that (26) is a logical truth pointed to the use of sentences like (27) in recording messages for an answering machine:

(27) I am not here now.

Suppose an individual A records a message on an answering machine for their home phone that says “I am not here now. Please leave your name and phone number, and I will return your call as soon as I can”. Another individual B then calls A’s house when A is not at home, and the answering machine plays A’s recorded message. It does not seem as though there is anything false in A’s message. Yet if (26) is a logical truth, then (27) should be a logical falsehood. Thus, Kaplan’s claim that (26) is a logical truth seems to run afoul of everyday facts about language and communication. For an extended discussion of the significance of examples like this, see Predelli, Contexts.

b. Kaplan’s Other Semantic Theory

The model theory for Kaplan’s logic is a development of the model-theoretic semantics for modal logic introduced by Saul Kripke in the early 1960s. The key insight to Kripke’s semantics for modal logic was the introduction of possible worlds as indices relative to which expressions are assigned extensions (truth values in the case of formulas, objects in the case of singular terms, and sets of ordered n-tuples in the case of n-place predicates). In this framework, the intension of an expression is the function whose value for each possible world is the extension of the expression relative to that possible world. The intension of a sentence, for example, is a function from possible worlds to truth values, where for any world w, the value of the intension of s for the world w is the truth value of s relative to w.

This allows us to assign to each sentence s a set of possible worlds at which s is true. For many philosophers, this set is an obvious and natural candidate for the proposition expressed by s. Extending this idea to Kaplan’s logic, the proposal is that the content of an expression relative to a context is an intension, and hence the proposition expressed by a sentence ϕ relative to a context c (in a model M) is just the set of possible worlds w (and times t, in Kaplan’s formal system), such that

⊨^M_cftw ϕ.

Extending this idea further to Kaplan’s semantic theory for indexicals and demonstratives in English, the proposal is that the proposition expressed relative to a context in which Barack Obama is the agent by the sentence

(31) I am flying

is just the set of possible worlds at which it is true that Barack Obama is flying. Similarly, the intension of the singular term “I” relative to this context is the function whose value for any possible world w is just Barack Obama.

This alternative semantic theory is suggested by some of Kaplan’s remarks in “Demonstratives,” and it is the theory that emerges from the formal semantics for the language LD spelled out above. The difference between this alternative semantic theory and the semantic theory attributed to Kaplan in section 3 is the subject of a great deal of contemporary controversy. At issue is the nature of propositions and meaning: according to the theory attributed to Kaplan in section 3, the proposition expressed by a sentence relative to a context is a structured, complex entity that includes as constituents the meanings, relative to the same context, of the words occurring in the sentence. According to the alternative semantic theory sketched immediately above, the proposition expressed by a sentence relative to a context has no such structure or constituents. It is a set of possible worlds.

One reason to prefer the theory attributed to Kaplan in section 3 is that it is only on this theory that we can distinguish between singular terms that are directly referential, and singular terms that are rigid but not directly referential. On the alternative semantic theory sketched in this section, the content of a term relative to a context is an intension: a function from possible worlds to objects. The intension of a rigid designator relative to a context is just a constant function—a function that for any possible world returns the same object. The intension of a directly referential expression is just the same thing. Thus there is, on this alternative semantic theory, no difference in content between a directly referential expression that refers to an object o and a rigid, but not directly referential expression that refers to o. Thus on this alternative semantic theory, the following two terms have the same content relative to any context:

(32) 3

and

(33) the natural number x greater than 1 such that x(x-1)(x-2) = x + ((x-1) + (x-2))

Since both (32) and (33) rigidly designate the number three, the intension of each relative to a context is just the function that for any possible world returns the number three. Thus this alternative theory effaces two obvious differences between (32) and (33): one is a difference in structure, and the other is an intuitive difference between the fact that (32) serves merely to tag a particular number while the definite description (33) picks out the number three in virtue of a particular property of that number. Both of these differences are preserved in a semantic theory according to which propositions have structure that reflects the structure of the sentences expressing them.

6. Objections to Kaplan’s Semantic Theory and Logic

For the reasons immediately above, most philosophers prefer Kaplan’s semantic theory introduced in section 3 to the theory above based more strictly on Kaplan’s logic. The objections to Kaplan’s semantic theory that follow focus on the theory introduced in section 3.

a. Objections to Direct Reference

One of the consequences of Kaplan’s semantic theory noted in section 3e is that indexicals and demonstratives are directly referential: relative to a context, the content of an indexical is just the object that the indexical refers to relative to that context. As a result, for any context c relative to which two indexicals refer to the same thing, the two indexicals have the same content. It follows that two sentences that differ only in so far as one contains one of these indexicals where the other sentence contains the other indexical will express the same proposition relative to any context relative to which the two indexicals have the same content. Suppose, for example, that Saul Kripke sees an individual in a mirror, and gesturing toward the mirror utters (34) and (35):

(34) I am Saul Kripke.

(35) He is not Saul Kripke.

It turns out, however, that Kripke is in fact seeing himself in the mirror, and so has referred to himself with “he”. Relative to this context, the two sentences that Kripke has uttered express the following propositions:

〈IDENTITY, 〈Saul Kripke, Saul Kripke〉〉

〈NEGATION, 〈IDENTITY, 〈Saul Kripke, Saul Kripke〉〉〉

(Where IDENTITY is the relation of being the same thing as, and NEGATION is the property of being false (or not true).) The second of these propositions is just the negation of the first. Thus relative to the context of Kripke’s utterance, (34) and (35) express contradictory propositions. If Kripke’s utterances are indicative of Kripke’s beliefs, then it appears to follow that Kripke believes these two contradictory propositions. Yet it is implausible to think that Kripke, careful logician that he is, would believe two obviously contradictory propositions such as these. (For the beginning of a response, see section 4b.)
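The way direct reference generates this result can be sketched in Python. The encoding is an assumption made for illustration: structured propositions are represented as nested tuples, the mirror context is a dictionary assigning referents to “I” and “he”, and the helper names are hypothetical.

```python
def content_of_indexical(word, context):
    """Direct reference: the content of an indexical is just its referent."""
    return context[word]

# Assumed context for the mirror case: "I" and "he" both pick out Kripke.
mirror_context = {"I": "Saul Kripke", "he": "Saul Kripke"}

def identity(a, b):
    """Build the structured proposition that a is the same thing as b."""
    return ("IDENTITY", (a, b))

def negation(p):
    """Build the structured proposition that p is false."""
    return ("NEGATION", p)

name = "Saul Kripke"  # the name is treated as contributing its bearer

prop_34 = identity(content_of_indexical("I", mirror_context), name)
prop_35 = negation(identity(content_of_indexical("he", mirror_context), name))

# (35) comes out as exactly the negation of (34).
print(prop_35 == negation(prop_34))  # True
```

Because the two indexicals contribute the same object, nothing in the resulting propositions records the difference between thinking of Kripke as “I” and thinking of him as “he”.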

A second objection to the thesis that indexicals and demonstratives are directly referential concerns the use of demonstrative pronouns “this” and “that” in complex noun phrases, such as “that dog chewing on a stick” or “this city”. Such phrases are usually called complex demonstratives. The standard Kaplanian view of complex demonstratives is that relative to a context c, (an occurrence of) the complex demonstrative “that F” refers to the object assigned to (the occurrence of) it by c, provided that the object satisfies the predicate F relative to c. Examples like (36) cast doubt on this consequence of Kaplan’s theory:

(36) Every hiker of the John Muir Trail remembers that day they stood on the summit of Mt. Whitney.

Sentence (36) contains a complex demonstrative “that day they stood on the summit of Mt. Whitney”. According to Kaplan’s theory, the content of this complex demonstrative relative to a context should be a particular day. Yet it is clear that the proposition expressed by (36) does not contain any particular day as a constituent. The proposition is not about any one particular day, but is instead about how each hiker remembers a different day. This is because the complex demonstrative contains a pronoun, “they”, which is bound by the quantifier phrase “every hiker of the John Muir Trail”.

A third objection to the thesis that indexicals and demonstratives are directly referential also concerns complex demonstratives. This objection is based on the observation that there are uses of complex demonstratives where there is no intuitive reference at all. An example is an utterance of (37):

(37) That first wolf that allowed itself to be domesticated did pretty well.

The speaker of this utterance clearly has no particular wolf in mind of which it would be correct to say that the speaker is referring to that wolf. Rather, the speaker is making a general claim to the effect that whatever wolf first allowed itself to be domesticated did pretty well. Thus in this case, there is nothing associated with the speaker’s utterance (no gesture or appropriate intention) that could serve to fix the reference of the demonstrative “that first wolf that allowed itself to be domesticated”. A semantics for demonstratives according to which they are directly referential cannot account for this: if there is nothing to fix the reference of the speaker’s use of the demonstrative, then according to Kaplan’s theory, the speaker’s utterance should be defective, inviting from the audience the question “Which wolf are you referring to?” But the speaker’s utterance is not defective. Kaplan’s theory does not have the resources to explain cases like this.

b. Objections to Kaplan’s Treatment of Contexts

A different source of worries about Kaplan’s theory is his treatment of contexts as sequences of parameters. On Kaplan’s view, a context c can be identified with the sequence

〈c_A, c_P, c_T, c_W〉,

where c_A is what Kaplan calls the agent, c_P the location, c_T the time, and c_W the world of the context, respectively. Indexicals have contents relative to these contexts. This treatment of contexts raises a problem for Kaplan’s semantic theory insofar as one of the basic goals of a semantic theory for a language like English is to determine constraints on what speakers of the language can use words and sentences of the language to strictly and literally say. The problem is how to apply Kaplan’s theory to actual and possible uses of language by speakers. Without some rule or principle that assigns formal contexts of Kaplan’s theory to uses of indexicals by speakers, Kaplan’s theory fails to achieve this basic goal of a semantic theory.

An example of such a principle for assigning contexts to uses is what we can call the naïve view of contexts:

For any utterance u of an indexical or sentence containing an indexical, the semantically relevant or semantically appropriate formal context for u is the sequence 〈c_A, c_P, c_T, c_W〉 such that c_A is the speaker of u, c_P is the location at which u occurs, c_T is the time at which u occurs, and c_W is the world in which u occurs.

The naïve view yields the correct results for many uses of sentences containing indexicals. If David Kaplan arrives at a party at 8 PM on January 31, 1973, and utters “I am here now!”, what Kaplan has intuitively said is that he is at the party at the time of his utterance. The naïve view would assign to Kaplan’s utterance of this sentence the context

〈David Kaplan, the location of the party, 8 PM on 1/31/1973, w〉.

Relative to this context, the content of “I am here now” is the proposition that David Kaplan is at the location of the party at 8 PM on January 31, 1973. This is just what we take David Kaplan to have said. In this way, principles like the naïve view bridge the gap between the formal contexts of Kaplan’s semantic theory and actual and possible uses of indexicals in communication.
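The bridging role of the naïve view can be sketched in Python. This is a minimal illustration rather than a formalization from Kaplan: the `Utterance` record, the `naive_context` function, and the tuple encoding of the proposition are all hypothetical names introduced here.

```python
from typing import NamedTuple

class Context(NamedTuple):
    agent: str      # c_A
    location: str   # c_P
    time: str       # c_T
    world: str      # c_W

class Utterance(NamedTuple):
    speaker: str
    place: str
    time: str
    world: str
    sentence: str

def naive_context(u: Utterance) -> Context:
    """The naive view: read each contextual parameter directly off the utterance."""
    return Context(u.speaker, u.place, u.time, u.world)

def content_i_am_here_now(c: Context):
    """Content of "I am here now" relative to c, encoded as a structured tuple."""
    return ("LOCATED-AT", (c.agent, c.location, c.time))

u = Utterance("David Kaplan", "the location of the party",
              "8 PM on 1/31/1973", "w", "I am here now!")
c = naive_context(u)
print(c)
print(content_i_am_here_now(c))
```

The function `naive_context` is the only place where facts about the utterance enter. Replacing it with a different assignment of contexts to utterances would change what speakers are predicted to say without touching the semantics itself.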

One problem with the naïve view of contexts is that the nature of utterances is left underspecified. An example discussed earlier in the section on logic will help to illustrate the issue: if someone records “I am not here now” on their answering machine, and someone else later calls and hears this recorded message, what event counts as the utterance? Is it the act of recording the message, or the act of calling and triggering the replay of the message? Or the event of the replay of the message itself? Written messages generate similar questions. The issues raised by written notes and recorded messages are currently a topic of much debate in the philosophy of language. (For an extended discussion and references, see Predelli, 2005.)

c. Objections to Kaplan’s Logic

One final objection to Kaplan’s theory targets his logic, and in particular his claims about the logical behavior of indexicals and demonstratives. According to Kaplan’s logic, an argument like (38) is valid, because the conclusion is true relative to any context relative to which the premise is true (trivially, since the premise and conclusion are the same):

(38) It is quiet now; therefore, it is quiet now.

Recall (from section 2e) that Kaplan argued against utterance-based theories precisely on the grounds that such theories predicted that arguments like this are invalid, because there are utterances of (38) in which the utterance of the premise is true, but the utterance of the conclusion false (if it turns suddenly very noisy halfway through the utterance of (38), for example).

Yet some philosophers have argued for precisely the opposite conclusion: that examples like this show that Kaplan’s claims about the logic of indexicals are wrong. A valid argument should provide a kind of epistemic assurance that anyone who uses the argument in reasoning will never be led from truth to falsehood. Yet the example above of the use of (38) appears to show that there are cases in which one who uses (38) in reasoning can be led from truth to falsehood. Thus, (38) does not provide the kind of epistemic assurance that a valid argument should provide.

The effect of this objection is potentially quite radical. If it is correct, then many arguments that at first glance appear to be valid are not valid. Exactly how radical the objection is depends in part on how widespread the phenomenon of indexicality is in natural language. Some philosophers, for example, have argued that quantifier phrases like “every sailor” are context-sensitive in much the same way that traditional indexicals such as “I” or “now” are. If this is correct, and the objection currently under consideration holds, then the traditional syllogism (39) is invalid:

(39) Every sailor is human; every human is a mammal; therefore, every sailor is a mammal.

This objection to Kaplan’s logic thus has potentially far-reaching consequences. (For an example of a philosopher who embraces these consequences, see Yagisawa, 1993.)

7. Alternatives to Kaplan’s Theory of Indexicals

Recent work on the semantics of indexicals has seen a proliferation of alternatives to Kaplan’s theory. These alternatives usually take one of two forms: (i) theories that reject Kaplan’s appeal to contexts (as formal objects) altogether in favor of a token-reflexive (or utterance-reflexive) semantic treatment of indexicals, and (ii) theories that retain Kaplan’s formal apparatus of contexts and character but propose alternative hypotheses about the meanings of particular indexicals or demonstratives. This section presents the most influential current token-reflexive theory, before turning to a very brief sketch of a handful of alternatives that are within the Kaplanian framework (or something very much like it).

a. John Perry’s Reflexive-Referential Theory

The most developed utterance-based semantics for indexicals is currently John Perry’s “reflexive-referential” theory. The distinguishing feature of Perry’s theory is the suggestion that a single utterance u of a sentence like “I am hungry” is associated with several different kinds of content. Chief among these are (i) the referential content of u, and (ii) the reflexive content of u. In this way, Perry seeks to combine the insights of Reichenbach and Burks and the direct reference semantics of Kaplan into one theory of indexicals.

To illustrate the difference between these two kinds of content, let u be an utterance of “I am hungry” by Saul Kripke. Then according to Perry, the referential content of u is the singular proposition that Saul Kripke is hungry. This is the same content that Kaplan’s theory would assign to the sentence relative to a formal context in which Kripke is the agent. But Perry also recognizes a distinct variety of content expressed by the utterance: the reflexive content of u is the proposition that the speaker of u is hungry. This is (roughly) the content that Reichenbach’s theory would assign to u.
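The difference between the two contents can be sketched in Python. The representation is an assumption made for illustration: an utterance is a simple record, propositions are tuples, and the names `reflexive_content` and `referential_content` are hypothetical labels for the two notions, not Perry's own formalism.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Utterance:
    label: str      # a name for the utterance itself, such as "u"
    speaker: str    # the individual who in fact produced it
    sentence: str

def reflexive_content(u: Utterance):
    """A proposition about the utterance itself: its speaker, whoever that is, is hungry."""
    return ("HUNGRY", ("the speaker of", u.label))

def referential_content(u: Utterance):
    """A singular proposition about the individual who in fact produced the utterance."""
    return ("HUNGRY", u.speaker)

u = Utterance(label="u", speaker="Saul Kripke", sentence="I am hungry")
print(reflexive_content(u))
print(referential_content(u))
```

Only `referential_content` reads the `speaker` field. The reflexive content mentions the utterance and a description, so it is available to a hearer who does not know who is speaking.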

Perry’s theory is based on several subtle distinctions. The first is a distinction between different ways in which the environment of an utterance influences the interpretation of that utterance. Perry calls the environment in which an utterance occurs the context of the utterance, and distinguishes between two roles of context, which he calls “pre-semantic” and “semantic”. (It is important to recognize that Perry’s use of “context” is distinct from Kaplan’s—according to Kaplan, a context is a formal object of a semantic theory; according to Perry, a context is a complex situation that includes an utterance.) Pre-semantic uses of context involve using cues from the context of an utterance to resolve ambiguities, such as knowing whether a speaker who utters

I saw her at the bank

is talking about a financial institution or a riverside, or knowing whether a speaker who utters

I saw her duck under the table

is talking about someone’s attempt to dodge something or instead is talking about someone’s choice of pet. Thus, we use the context of an utterance pre-semantically in order to determine the structure and conventional meanings intended by the speaker of the utterance.

Semantic uses of the context of an utterance include the interpretation of any indexicals uttered in the course of the utterance. Here Perry makes two further distinctions, one between different kinds of features of a context, and one between the different ways indexicals exploit these different features. The first of these is a distinction between what Perry calls narrow context and wide context. The narrow context of an utterance comprises constitutive features of the utterance. Perry takes these to be the agent, time, and location of the utterance. Changing any of these results in a different utterance. The wide context of an utterance is, in effect, everything else that might be relevant to the interpretation of the indexicals uttered in the utterance. One of Perry’s examples of a feature of wide context is the length of the space between a speaker’s hands when a speaker utters

It was yea big.

This is a feature of the context of the utterance that could be changed without resulting in a different utterance. Thus, it is not a component of the narrow context of the utterance. Features of wide context are thus optional in a way that features of narrow contexts are not: there are utterances in which a speaker does not indicate any length of space between his or her hands, but there is no utterance that does not take place at a certain time.

The second distinction Perry makes in his account of semantic uses of context in the interpretation of indexicals is between kinds of indexicals. According to Perry, some indexicals are such that the referential content of an utterance of them is fixed automatically in the context of the utterance in virtue of their meanings. These are very much like Kaplan’s “pure indexicals”. The least controversial example of an automatic indexical in Perry’s theory is the first-person pronoun “I”. Another plausible candidate is the modal indexical “actually”, any utterance of which automatically picks out the world in which the utterance occurs.

In contrast to automatic indexicals, Perry argues that some indexicals are such that the referential content of an utterance of them is determined in part by certain intentions with which the speaker of the utterance utters them. The clearest examples of such intentional indexicals are Kaplan’s true demonstratives: the demonstrative pronouns “this”, “that”, “these”, “those”, and “there”. But Perry also notes that the referential contents of certain utterances of “now” and “here” are fixed by the speakers’ intentions. An example is the use of “now” in an utterance of

The summers are warmer now than they were ten years ago.

The crucial observation here is that the referential content of “now” is a certain (perhaps not totally determinate) span of time, and the length of this span is determined by the speaker’s intentions.

Yet it is also the case that the referential content of any utterance of “now” is constrained in such a way that the span of time to which the utterance refers must include the time of the utterance: a feature of the narrow context of the utterance. This illustrates how Perry’s two distinctions—narrow versus wide context, and automatic versus intentional indexicals—cross cut each other. The result is a fourfold classification of indexicals:

NA Narrow Context; Automatic Indexical: “I”, “actually”

NI Narrow Context; Intentional Indexical: “now”, “here”

WA Wide Context; Automatic Indexical: “tomorrow”, “yea”

WI Wide Context; Intentional Indexical: “this”, “that dog”, “there”

The referential content of an utterance of an NA indexical is fixed automatically to some feature of the narrow context. The referential content of an utterance of an NI indexical is fixed by the intentions of the speaker of the utterance, but is constrained in some way by some feature of the narrow context. The referential content of an utterance of a WA indexical is fixed automatically to some feature of the wide context of the utterance. Finally, the referential content of an utterance of a WI indexical is fixed by the intentions of the speaker of the utterance, and only constrained, if at all, by whatever features of the wide context are determined by the speaker’s intentions.
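The fourfold classification is the product of two independent two-way distinctions, which can be sketched as a small lookup table in Python. The enum and function names are hypothetical, and the entries are just the example expressions listed in the classification above.

```python
from enum import Enum

class ContextKind(Enum):
    NARROW = "N"
    WIDE = "W"

class IndexicalKind(Enum):
    AUTOMATIC = "A"
    INTENTIONAL = "I"

# Example expressions taken from the four rows of the classification.
CLASSIFICATION = {
    "I":        (ContextKind.NARROW, IndexicalKind.AUTOMATIC),
    "actually": (ContextKind.NARROW, IndexicalKind.AUTOMATIC),
    "now":      (ContextKind.NARROW, IndexicalKind.INTENTIONAL),
    "here":     (ContextKind.NARROW, IndexicalKind.INTENTIONAL),
    "tomorrow": (ContextKind.WIDE,   IndexicalKind.AUTOMATIC),
    "yea":      (ContextKind.WIDE,   IndexicalKind.AUTOMATIC),
    "this":     (ContextKind.WIDE,   IndexicalKind.INTENTIONAL),
    "that dog": (ContextKind.WIDE,   IndexicalKind.INTENTIONAL),
    "there":    (ContextKind.WIDE,   IndexicalKind.INTENTIONAL),
}

def label(expression: str) -> str:
    """Return the two-letter label (NA, NI, WA, or WI) for an expression."""
    context_kind, indexical_kind = CLASSIFICATION[expression]
    return context_kind.value + indexical_kind.value

def fixed_by_intention(expression: str) -> bool:
    """True if the speaker's intentions help fix the referential content."""
    return CLASSIFICATION[expression][1] is IndexicalKind.INTENTIONAL

for word in ("I", "now", "tomorrow", "that dog"):
    print(word, label(word), fixed_by_intention(word))
```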

The reflexive content of an utterance of an indexical, on Perry’s theory, is roughly the content that encodes what a speaker who is competent with the indexical has to know about the utterance in virtue of which they are in a position to identify the referential content of the utterance. This is captured in the claim that the reflexive content of an utterance u of “I am hungry” is the descriptive proposition that the speaker of u is hungry. Anyone who overhears u is in a position to understand this reflexive content. But only someone who can identify the speaker of u is in a position to grasp the referential content of u. In this way, Perry attempts to revive Burks’s theory of the indexical meaning of a token (or utterance) as a theory of what a competent language user has to know in order to understand that token.

b. Expression-Based Alternatives

Recent work on the semantics of indexicals and demonstratives has led to a proliferation of alternative proposals. Many of these proposals focus on complex demonstratives (see section 6a). The challenge for theories of complex demonstratives is to accommodate both the examples that support Kaplan’s theory—according to which both simple and complex demonstratives are directly referential—and the examples (some of which were presented in section 6a) in which complex demonstratives behave in ways inconsistent with Kaplan’s theory.

One way to meet the challenge of the range of examples is to propose that complex demonstratives are ambiguous. On this view, it is possible to maintain that the examples of complex demonstratives that support Kaplan’s theory are cases of direct reference, while the other examples are cases in which the complex demonstratives in question have a different semantics. One advantage to such a theory is that it preserves the theoretical elegance and intuitive appeal of Kaplan’s treatment of standard referential uses of complex demonstratives. One disadvantage of such a theory is that positing an ambiguity is often thought of as a cheap solution to a problem. Thus, any philosopher or linguist who wants to defend an ambiguity theory of this sort has to argue that the ambiguity is well-motivated, and not simply a response to recalcitrant examples. (For further discussion of ambiguity theories, see Georgi, 2012.)

A different way to meet the challenge posed by the range of uses of complex demonstratives is to argue that some set of these uses reveals the true semantic nature of complex demonstratives, and then show how to explain the other uses within the framework of the proposed semantics. Two recent proposals along these lines are due to Jeffrey C. King and to David Braun. According to King, the uses of complex demonstratives discussed in section 6a show that complex demonstratives are not directly referential at all. On King’s view, complex demonstratives are quantifiers, like “every dog”, or “some homemade cookie”. The key semantic feature of quantifiers is that their content, relative to a context, is not an object or individual. Rather, the content of a quantifier relative to a context is itself a structured, complex entity, whose components are the contents of the expressions that occur within the quantifier. King defends an elaborate theory of the quantificational meanings of complex demonstratives, and shows how this theory accommodates a wide range of linguistic data.

In contrast to King’s theory, David Braun defends a traditional Kaplanian treatment of complex demonstratives as directly referential. On Braun’s view, the uses of complex demonstratives in section 6a that appear to conflict with Kaplan’s theory can be explained on pragmatic grounds: they are cases in which what a speaker means goes above and beyond what the speaker strictly and literally says. This allows Braun to maintain Kaplan’s theory, and to explain the apparently conflicting data, while rejecting the claim that complex demonstratives are ambiguous.

8. References and Further Reading

Bach, Kent. 1992. “Intentions and Demonstrations.” Analysis, 52(3): 140–146.
- Bach defends the view that demonstrative reference is fixed by the speaker’s referential intentions.
Braun, David. 1996. “Demonstratives and Their Linguistic Meanings.” Noûs, 30(2): 145–173.
- This paper is the source of the influential context-shifting semantic theory of demonstratives.
Braun, David. 2008. “Complex Demonstratives and Their Singular Contents.” Linguistics and Philosophy, 31: 57–99.
- Braun defends a direct reference semantics for complex demonstratives from the objections raised by Jeff King and others.
Burks, Arthur W. 1949. “Icon, Index, and Symbol.” Philosophy and Phenomenological Research, 9(4): 673–689.
- Burks provides both an insightful discussion of Peirce’s original remarks on indexicals and a sophisticated theory of their meaning.
García-Carpintero, Manuel. 1998. “Indexicals as Token-Reflexives.” Mind, 107: 529–563.
- García-Carpintero presents a careful analysis of token-reflexive views of indexicals and defends them from several influential objections.
Georgi, Geoff. 2012. “Reference and Ambiguity in Complex Demonstratives.” In William P. Kabasenche, Michael O’Rourke, and Matthew H. Slater (eds), Reference and Referring: Topics in Contemporary Philosophy, v.10. Cambridge, MA: MIT Press, pp. 357–384.
- Georgi defends a view of complex demonstratives according to which they are ambiguous between referential and non-referential readings.
Kamp, Hans. 1971. “Formal Properties of ‘Now.’” Theoria, 37: 237–273.
- Kamp’s paper is an early and influential discussion of double-indexing, or two-dimensional semantics, as applied to natural languages.
Kaplan, David. 1989a. “Demonstratives.” In Joseph Almog, John Perry, and Howard Wettstein (eds), Themes from Kaplan. New York: Oxford University Press, pp. 481–563.
- Kaplan’s most influential work on demonstratives. Its subtitle says it all: “an essay on the semantics, logic, metaphysics, and epistemology of demonstratives and other indexicals.”
Kaplan, David. 1989b. “Afterthoughts.” In Joseph Almog, John Perry, and Howard Wettstein (eds), Themes from Kaplan. New York: Oxford University Press, pp. 565–614.
- Kaplan provides further reflection on some of the main themes of “Demonstratives.”
King, Jeffrey C. 2001. Complex Demonstratives. Cambridge, MA: MIT Press.
- King presents several powerful criticisms of Kaplan’s direct reference semantics for complex demonstratives, and defends an alternative semantic theory according to which complex demonstratives are context-sensitive quantifiers.
Kripke, Saul. 1980. Naming and Necessity. Cambridge, MA: Harvard University Press.
- Kripke argues against descriptivist theories of the meaning and reference of proper names and natural kind terms, along the way introducing the definition of rigid designation.
Perry, John. 1977. “Frege on Demonstratives.” The Philosophical Review, 86(4): 474–497.
- Perry argues that indexicals and demonstratives pose a puzzle for the Fregean theory of meaning as sense.
Perry, John. 1979. “The Problem of the Essential Indexical.” Noûs, 13(1): 3–21.
- Perry offers several influential examples in support of the view that indexicals play a privileged role in epistemology.
Perry, John. 2001. Reference and Reflexivity. Stanford, CA: CSLI Publications.
- Perry presents a sophisticated token-reflexive alternative to Kaplan’s semantic theory that contains many insights into the behavior of indexicals.
Predelli, Stefano. 2005. Contexts. Oxford: Clarendon Press.
- Predelli investigates the philosophical foundations of the second kind of semantic theory attributed to Kaplan.
Reichenbach, Hans. 1947. Elements of Symbolic Logic. New York: Macmillan.
- The book contains the original statement of the view that indexicals are token-reflexives.
Reimer, Marga. 1991. “Do Demonstrations Have Semantic Significance?” Analysis, 51(4): 177–183.
- Reimer argues that the reference of a use of a demonstrative is fixed by the associated demonstration (or gesture).
Salmon, Nathan. 2002. “Demonstrating and Necessity.” The Philosophical Review, 111(4): 497–537.
- Salmon argues for an alternative treatment of demonstratives within Kaplan’s semantic framework, according to which demonstrations are included in the context.
Salmon, Nathan. 2005. Reference and Essence, 2nd Edition. Amherst, NY: Prometheus Books.
- Salmon investigates the relationship between the theory of direct reference in semantics and essentialism in metaphysics.
Schlenker, Philippe. 2003. “A Plea for Monsters.” Linguistics and Philosophy, 26: 29–120.
- Schlenker presents data supporting the existence of monsters in natural language—specifically in propositional attitude reports—and offers a semantic theory that accommodates these data.
Soames, Scott. 2002. Beyond Rigidity. Oxford: Oxford University Press.
- Soames attempts to consolidate the lessons of Kripke’s Naming and Necessity. Chapter 2 includes a sophisticated discussion of rigid designation and its significance for Kripke’s arguments against descriptivism about proper names.
Soames, Scott. 2005. Reference and Description. Princeton: Princeton University Press.
- Soames presents an analysis and criticism of the approach to meaning called “two-dimensional semantics”, including a careful discussion of Kaplan’s logic and semantics of indexicals.
Yagisawa, Takashi. 1993. “Logic Purified.” Noûs, 27(4): 470–486.
- Yagisawa applies the techniques of Kaplan’s logic to more natural uses of language.

Author Information

Geoff Georgi
Email: Geoff.Georgi@mail.wvu.edu
West Virginia University
U. S. A.

The Yablo Paradox

The Yablo Paradox implies there is no way to coherently assign a truth value to any of the sentences in the countably infinite sequence of sentences, each of the form, “All of the subsequent sentences are false.” Specifically, the Yablo Paradox arises when we consider the following infinite sequence of sentences:

$$\begin{aligned} S_{1}: ~&\text{For all } m > 1, S_{m} \text{ is false.}\\ S_{2}: ~&\text{For all } m > 2, S_{m} \text{ is false.}\\ S_{3}: ~&\text{For all } m > 3, S_{m} \text{ is false.}\\ & ~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots\\ S_{n}: ~&\text{For all } m > n, S_{m} \text{ is false.}\\ & ~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots\end{aligned}$$

The sentences in the Yablo sequence seem paradoxical in the same sense as the better-known Liar Paradox:
$$\lambda: \lambda \text{ is false.}$$
which, when combined with the relevant instance of Tarski’s T-schema:

$$\begin{aligned} TS: &~\text{For any sentence } \Phi:\\ & \ulcorner \Phi\urcorner \text{ is True } \leftrightarrow \Phi \end{aligned}$$

(where $\ulcorner \Phi \urcorner$ is the result of applying any standard naming device, for example Gödel coding, to the expression $\Phi$) entails a contradiction. We can show that the Yablo sequence entails a contradiction in a similar manner via the following informal argument:
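The argument can be laid out step by step. What follows is a reconstruction from the definitions of $S_n$ and TS given above, not a quotation. The step numbering is introduced here, and the derivation assumes that “is false” may be read as “is not true”, so that a sentence refuted by reductio counts as false.

```latex
\begin{aligned}
(1)~~ & S_{n} \text{ is true} && \text{assumption, for arbitrary } n\\
(2)~~ & \text{For all } m > n,\ S_{m} \text{ is false} && \text{from (1), by } TS\\
(3)~~ & S_{n+1} \text{ is false} && \text{from (2)}\\
(4)~~ & \text{For all } m > n+1,\ S_{m} \text{ is false} && \text{from (2)}\\
(5)~~ & S_{n+1} \text{ is true} && \text{from (4), by } TS\\
(6)~~ & \bot && \text{from (3) and (5)}\\
(7)~~ & \text{For all } n,\ S_{n} \text{ is false} && \text{from (1) to (6), discharging (1)}\\
(8)~~ & \text{For all } m > 1,\ S_{m} \text{ is false} && \text{from (7)}\\
(9)~~ & S_{1} \text{ is true} && \text{from (8), by } TS\\
(10)~~ & S_{1} \text{ is false} && \text{from (7)}\\
(11)~~ & \bot && \text{from (9) and (10)}
\end{aligned}
```

Steps (1) to (6) show that no sentence in the sequence can be true, and steps (7) to (11) show that they cannot all be false either.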

“$\bot$” is the symbol for a contradictory sentence. If this were all there was to the Yablo Paradox—just another example of a semantic paradox that can be generated by sentences that predicate truth of themselves or of each other—then it would not be of much philosophical interest. But the Yablo Paradox has a number of properties not shared by its predecessors such as the venerated (or vilified?) Liar Paradox. In particular, the Yablo Paradox seems, at least at first glance, to show that semantic paradoxes do not require the sort of circularity found in the Liar Paradox (and usually taken to be a necessary condition for the presence of such paradoxes, at least until 1993).

This article carefully examines a number of aspects of the Yablo Paradox, including whether the paradox involves circularity in some disguised form, as is claimed by Graham Priest (Priest 1997) amongst others. In addition, we shall look carefully at the rather special sense in which the Yablo Paradox is paradoxical, and we shall also examine various ways that the infinitely descending structure of the Yablo Paradox can be generalized to arrive at other, at least apparently non-circular, versions of well-known paradoxes.

1. Origins of the Paradox
2. Paradoxical or Not?
3. Circular or Not?
4. Generalizing the Paradox
5. References and Further Reading

1. Origins of the Paradox

The Yablo Paradox first appears in an article titled “Truth and Reflection” (Yablo 1985), where Yablo includes the following example:

Paradox Without Self-Reference: Here is an example designed to show that self-reference is not essential to paradox. For each $m \in \omega$, let $\Phi_m$ be $$\forall n > m, \neg \text{T} \ulcorner \Phi_n \urcorner$$ so that each $\Phi_m$ says that every succeeding $\Phi_n$ is untrue. An intuitive argument shows that every one of these sentences is paradoxical. If $\Phi_m$ were true, then given what it says, every succeeding $\Phi_n$ would be untrue; but if so then every $\Phi_n$ after $\Phi_{m+1}$ is untrue, whence $\Phi_{m+1}$ is true after all. If $\Phi_m$ were untrue, then there would be an $n > m$ such that $\Phi_n$ was true; but then by the argument just given $\Phi_{n+1}$ would be both true and untrue. Once again, each $\Phi_m$ is paradoxical in the sense defined above. (Yablo 1985, 340)

Yablo himself did not, at the time, attach much importance to the paradox, since it did not seem to present any new technical difficulties for the two front-running formal theories of truth at the time: Kripke’s approach sketched in “Outline of a Theory of Truth” (Kripke 1975) and the Gupta/Herzberger revision theory given in Gupta’s “Truth and Paradox” (Gupta 1982) and Herzberger’s “Notes on Naïve Semantics” (Herzberger 1982a) and “Naïve Semantics and the Liar Paradox” (Herzberger 1982b). Nevertheless, due in part to prodding by Kripke (for more on Kripke’s involvement, see (Cook 2014), Chapter 1), Yablo eventually realized that the paradox presented a serious philosophical challenge to any account (including, arguably, both the Kripke and the Gupta/Herzberger approaches) that diagnosed semantic paradoxes in terms of circularity, regardless of whether the formal apparatus built upon that philosophical account handled the Yablo construction adequately as a matter of ‘logical luck’. As a result, Yablo published a short standalone version of the paradox (Yablo 1993), and the cottage industry of work on this puzzle immediately began to boom.

Kripke’s involvement is no accident, since he raised at least the possibility of paradoxes based on non-circular, but infinitely descending structures in (Kripke 1975):

One surprise to me was the fact that the orthodox approach by no means obviously guarantees groundedness… Even if unrestricted truth definitions are in question, standard theorems easily allow us to construct a descending chain of first-order languages $L_0$, $L_1$, $L_2$,… such that $L_i$ contains a truth predicate for $L_{i+1}$. I don’t know whether such a chain can engender ungrounded sentences, or even quite how to state the problem here; some substantial technical questions in this area are yet to be resolved. (Kripke 1975, 698)

Importantly, Kripke is not considering an infinitely descending sequence of sentences within a single language with a single truth predicate—that is, he is not asking about the construction at issue in the Yablo paradox proper—but is instead wondering whether similar non-circular paradoxes can be constructed within an infinitely descending sequence of Tarskian-style metalanguages, each of which contains a truth predicate for the language ‘below’ it. An affirmative answer to this distinct, but obviously intimately related question was provided by Albert Visser, independently of Yablo’s work, in “Semantics and the Liar Paradox” (Visser 2002), which was originally published in 1989 (after Yablo’s original version, but before the well-known 1993 publication of it).

Although Yablo was unaware of Kripke’s prescient comments (and of Visser’s independent work) when constructing his version of the paradox, he later characterizes the importance of the paradox in terms of its consequences for a Tarskian account of truth based on a hierarchy of metalanguages:

Are the semantic and set-theoretic paradoxes circularity-based? This has been for a long time the dominant view. It shows up in the frequently heard claims that one sure way to avoid the semantic paradoxes is to insist with Tarski on a rigid separation of object language from meta-language, and one sure way to avoid the set theoretic paradoxes is to insist with Russell on a rigid hierarchy of types.

But these claims are open to question, especially the first. Tarskian strictures may block the Liar paradox but they do not block all paradoxes of the Liar type. An example is what we can call the $\omega$-Liar [the Yablo paradox]. … So the Tarskian way of avoiding paradox relies on more than a rigid object/metalanguage distinction. It is also required that the sequence of languages eventually grounds out in a bottom-level object language. (Yablo 2006, 166–167)

In short, the Yablo Paradox highlights the fact that accounts of, or solutions to, the semantic paradoxes cannot make do merely with ruling out (either philosophically or mathematically) problematic circular constructions. In addition, any such account must rule out, or at least account for, the similar puzzles that arise due to infinitely descending non-circular constructions. The standard formulation of the Tarskian account, with its insistence that a rigid object-language/metalanguage distinction eliminates semantic paradoxes, is but one of many targets, since it is susceptible to Yablo-style (better: Kripke/Visser-style) constructions unless supplemented with the requirement that the hierarchy of metalanguages is itself grounded.

This concludes our tour of the origins, and initial importance, of the Yablo Paradox. But much of this is, as is often the case in philosophy, more complicated and more controversial than it initially appears to be. In particular, both the claim that the Yablo Paradox is genuinely non-circular, and the claim that it is a paradox in the first place (at least, in the same sense that the Liar Paradox is a paradox) can be, and have been, challenged. In the next two sections we shall examine these claims, and we shall then, in § 4, see how to construct a truly non-circular paradox.

Before doing so, it is worth noting that we shall concentrate on Yablo’s version of the paradox, rather than on the Kripke/Visser variant, since Yablo’s version is both simpler and better known. All of the points made below can be reformulated so that they apply to the Kripke/Visser construction, however.

2. Paradoxical or Not?

In this section we shall examine the claim that the Yablo Paradox is, in fact, a paradox at all. One recent work on paradoxes defines paradoxes as follows:

A paradox (or aporia) is a type of argument. In particular, a paradox is an argument that:

Begins with premises that seem uncontroversially true.

Proceeds via reasoning that seems uncontroversially valid.

Arrives at a conclusion that is a contradiction, is false, or is otherwise absurd, inappropriate, or unacceptable.

(Cook 2012, 9–10)

Given this understanding of the nature of paradoxes, it seems like the Yablo Paradox is a paradigm instance; after all, doesn’t the argument given in the introduction of this essay amount to a proof of an absurdity (a contradiction, even) based on premises that seem uncontroversially true (or, at the very least, would seem uncontroversially true, did we not already have doubts about truth stemming from the earlier Liar paradox), and using reasoning that seems uncontroversially valid?

Well, “yes” and “no”. While that informal argument does show something, it doesn’t show that we can derive a contradiction, within standard first-order logic, from the infinite collection of sentences that are presumably meant to be the source of the paradox—the derivation given above ‘slips’ in additional resources that are worth identifying and assessing a bit more carefully.

In order to do this, we need to be a bit clearer and more explicit about both the construction of the infinite sequence of sentences that are involved in the Yablo Paradox, and about the exact details of whatever proof is supposed to provide the contradiction. Such clarity can be achieved by moving from the informal context within which the discussion has been situated to a precise construction of the paradox within first-order logic supplemented with a two-place satisfaction predicate $Sat(x, y)$, where $Sat(x, y)$ is meant to hold of two numbers $\alpha$ and $\beta$ if and only if $\alpha$ is the Gödel code of a predicate “$\Phi(y)$” (i.e. $\alpha = \ulcorner \Phi(y) \urcorner$) and that predicate holds of $\beta$ (i.e. “$\Phi(\beta)$” is true). In short, for any predicate $\Phi(y)$, we have:

$$Sat(\ulcorner \Phi(y) \urcorner, \beta) \leftrightarrow \Phi(\beta)$$

We then obtain the Yablo Paradox by applying the binary version of Gödel’s diagonalization lemma:

Diag: For any binary relation symbol $\Phi(x, y)$ there is a unary predicate $\Psi(x)$ such that:

$$(\forall x)(\Psi(x) \leftrightarrow \Phi(\ulcorner \Psi(x) \urcorner, x))$$

is a theorem of arithmetic.

to the following predicate:

$$(\forall n)(n > y \rightarrow \neg Sat(x, n))$$

to obtain what (Ketland 2005) calls the Uniform Fixed-Point Yablo Principle:

$$UFYP: ~ (\forall z)(Y(z) \leftrightarrow (\forall n)(n > z \rightarrow \neg Sat(\ulcorner Y(x)\urcorner, n)))$$

We then obtain the infinite sequence of sentences involved in the Yablo Paradox by considering the $\omega$-sequence of instances of the $UFYP$:

$$\begin{aligned} &Y(1) \leftrightarrow (\forall m)(m > 1 \rightarrow \neg Sat(\ulcorner Y(x) \urcorner, m))\\&Y(2) \leftrightarrow (\forall m)(m > 2 \rightarrow \neg Sat(\ulcorner Y(x) \urcorner, m))\\&Y(3) \leftrightarrow (\forall m)(m > 3 \rightarrow \neg Sat(\ulcorner Y(x) \urcorner, m))\\&\quad\vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots\\&Y(n) \leftrightarrow (\forall m)(m > n \rightarrow \neg Sat(\ulcorner Y(x) \urcorner, m))\\&\quad \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots \end{aligned}$$

Of course, neither the $UFYP$, nor the infinite sequence of its instances, is paradoxical on its own. Just as we require an instance of Tarski’s T-schema in order to obtain a contradiction from the Liar Paradox, we require some principles governing the satisfaction predicate $Sat(x, y)$ in order to derive a contradiction here. There are two options, which amount to two ways of disambiguating the use of “$\beta$” in the semi-formal principle governing “$Sat(x, y)$” given above. First, we can use the $Y(x)$-Generalized Satisfaction Principle:

$$Y(x)\text{-}GSP: (\forall x)(Sat(\ulcorner Y(x)\urcorner, x) \leftrightarrow Y(x))$$

Alternatively, we could adopt, not the universally quantified formula $Y(x)\text{-}GSP$, but rather its instances (motivated, perhaps, by the thought that the Yablo Paradox itself does not—at least in the informal formulation and proof to contradiction given above—seem to involve anything like the $UFYP$, but instead involves merely the infinite list of sentences that are its instances, and hence should not require the full $Y(x)\text{-}GSP$, but should likewise require solely its instances):

$$\begin{aligned} &Y(1) \leftrightarrow Sat(\ulcorner Y(x) \urcorner, 1)\\&Y(2) \leftrightarrow Sat(\ulcorner Y(x) \urcorner, 2)\\&Y(3) \leftrightarrow Sat(\ulcorner Y(x) \urcorner, 3)\\&\quad \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots \\&Y(n) \leftrightarrow Sat(\ulcorner Y(x) \urcorner, n)\\&\quad \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots \end{aligned}$$

Interestingly, in order to derive a contradiction from these ingredients, it turns out that neither the infinite list of instances of the $UFYP$, nor the infinite list of instances of the $Y(x)\text{-}GSP$, is strong enough, mathematically speaking, to do the work. In order to derive a contradiction, we need the full strength of the universally quantified $UFYP$ and $Y(x)\text{-}GSP$. First, the following derivation shows how to obtain a contradiction from the $UFYP$ and $Y(x)\text{-}GSP$:
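One way to lay out such a derivation is sketched below. It is a reconstruction from the two principles exactly as stated above; the step numbering and the arbitrary name $k$ are our own choices rather than part of any cited presentation.

$$\begin{aligned}
&1.~ Y(k) && \text{assumption, for arbitrary } k\\
&2.~ (\forall m)(m > k \rightarrow \neg Sat(\ulcorner Y(x) \urcorner, m)) && \text{from 1, } UFYP\\
&3.~ \neg Sat(\ulcorner Y(x) \urcorner, k+1) && \text{from 2, since } k+1 > k\\
&4.~ \neg Y(k+1) && \text{from 3, } Y(x)\text{-}GSP\\
&5.~ (\forall m)(m > k+1 \rightarrow \neg Sat(\ulcorner Y(x) \urcorner, m)) && \text{from 2, since } m > k+1 \text{ implies } m > k\\
&6.~ Y(k+1) && \text{from 5, } UFYP\\
&7.~ \neg Y(k) && \text{from 1, 4, 6 by reductio, discharging 1}\\
&8.~ (\forall z)(\neg Y(z)) && \text{from 7, universal generalization}\\
&9.~ (\forall z)(\neg Sat(\ulcorner Y(x) \urcorner, z)) && \text{from 8, } Y(x)\text{-}GSP\\
&10.~ (\forall m)(m > 1 \rightarrow \neg Sat(\ulcorner Y(x) \urcorner, m)) && \text{from 9}\\
&11.~ Y(1) && \text{from 10, } UFYP\\
&12.~ \neg Y(1) && \text{from 8, contradicting 11}
\end{aligned}$$

Steps 2, 6, and 11 each use the universally quantified $UFYP$, and steps 4 and 9 use the universally quantified $Y(x)\text{-}GSP$. Step 8 generalizes over a variable, so the sketch does not go through if either principle is replaced by the list of its numerical instances.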

Thus, we can prove a contradiction from the $UFYP$ and the $Y(x)\text{-}GSP$. But the following results (which we shall not prove here) show that we cannot replace either of these principles with the infinite list of their instances:

Theorem 2.1. $UFYP$, plus the infinite list of instances of $Y(x)\text{-}GSP$, is consistent.

Proof: See (Ketland 2005) for the proof, and (Ketland 2004) and (Cook 2014) for further discussion. $\square$

Theorem 2.2. $Y(x)\text{-}GSP$, plus the infinite list of instances of $UFYP$, is consistent.

Proof: See (Ketland 2005) for the proof, and (Ketland 2004) and (Cook 2014) for further discussion. $\square$

[A bit more detail for the cognoscenti: Both of these results are consequences, loosely speaking, of the compactness theorem for first-order logic. Neither collection of sentences is satisfiable on the ‘standard’ model, but in both cases we can construct a non-standard model of arithmetic satisfying the principles in question, which (by the soundness theorem) suffices for consistency, and also shows that both infinite collections of formulas, while consistent, are $\omega$-inconsistent. For a further discussion of the philosophical relevance of $\omega$-inconsistency to semantic paradoxes in general and to the Yablo Paradox in particular, the reader is encouraged to consult (Barrio 2010), and for an insightful discussion of the Yablo Paradox in second-order arithmetic, which has no non-standard models, the reader should consult (Picollo 2013).]

The second theorem is of particular interest in the present context, since it tells us that, even in the presence of the relatively strong $Y(x)\text{-}GSP$, we cannot prove a contradiction from the infinite sequence of sentences involved in the Yablo Paradox—at least, we cannot do so within first-order arithmetic supplemented with our binary satisfaction predicate “$Sat(x, y)$”. But, we might ask, what prevents us from doing so? In particular, what prevents the somewhat more informal derivation given in the introduction from being carried out within first-order arithmetic?

The answer, once one knows to look for it, is not difficult to spot. To save the reader inconvenient scrolling, let’s first reproduce the informal proof here:

The offending inference—offending in the sense that it is not valid in first-order arithmetic—is the move from line 7 to line 8. What lines 1–6 of the proof show (if regimented into precise first-order notation) is that, for any standard natural number $n$, we can prove that $S_n$ is false. We cannot move from this—a proof of each instance of the sequence:

$$S_1 \text{ is false}, S_2 \text{ is false}, S_3 \text{ is false}, \dots S_n \text{ is false}, \dots$$

to the desired conclusion:

$$\text{For all }n, S_n \text{ is false.}$$

without knowing that the standard natural numbers are, in fact, all of the natural numbers. One way of achieving such additional deductive power is to add the following infinitary $\omega$-rule to first-order arithmetic:

$\omega$-R: For any predicate $\Phi(x)$, if we can prove:

$$\Phi(1), \Phi(2), \Phi(3), \dots, \Phi(n), \dots$$

then we can conclude:

$$(\forall x)(\Phi(x))$$

Adding this rule to first-order arithmetic results in a much stronger (and in fact complete) system of arithmetic, however, and thus far outstrips the resources of standard first-order logic and first-order arithmetic.

In short, the reasoning used in the informal derivation of a contradiction presupposes that we are reasoning unambiguously about the standard, intended model of arithmetic (something ruled out by the Löwenheim-Skolem theorem in first-order contexts). As a result, we now have a better understanding of what, exactly, is involved in the Yablo Paradox, at least if the paradox is a genuine paradox in the sense of the term used in the passage from (Cook 2012) cited above. Either (i) the paradox involves, not merely the infinite sequence of sentences as originally supposed, but the universal generalization that provides this sequence as its instances (that is, the $UFYP$) plus the similarly generalized $Y(x)\text{-}GSP$ (rather than merely its instances), or (ii) the paradox involves deductive resources beyond those found in standard first-order arithmetic (even though each of the infinite sequence of sentences involved in the paradox is formulable in the standard language of first-order arithmetic, supplemented with a binary satisfaction predicate). Either way, the infinitary nature of the Yablo Paradox involves notions that go beyond what is required for its finitary, circular cousins such as the Liar Paradox.

3. Circular or Not?

Our next task is to examine the claim that the Yablo paradox is, in fact, a genuinely non-circular paradox, and thus provides a new species of paradox distinct from paradoxes involving circularity in an essential way, such as the Liar paradox. Although the informal discussion of the Yablo paradox seems to make its non-circularity evident—after all, each sentence only ‘says’ something about sentences below it in the list—we have already seen evidence that a merely informal treatment of the paradox in question can be misleading (for example, in terms of exactly what is required to generate a contradiction). Thus, it is worth examining the claim that the Yablo paradox is genuinely non-circular in a bit more detail as well.

Almost immediately after the well-known (Yablo 1993) version of the Yablo paradox appeared in print, Graham Priest formulated an objection to Yablo’s claim that the paradox is, in fact, non-circular. The critical passage is this:

… the paradox concerns a predicate $Y(x)$ of the form:

$$(\forall k > x)( \neg Sat(\ulcorner Y(z) \urcorner, k))$$

and the fact that

$Y(x)$ = ‘$(\forall k > x)( \neg Sat(\ulcorner Y(z) \urcorner, k))$’

shows that we have a fixed point, $Y(x)$ here, of exactly the same self-referential kind as in the liar paradox. In a nutshell, $Y(x)$ is the predicate ‘no number greater than $x$ satisfies this predicate’. The circularity is now manifest. (Priest 1997, 238, notation changed to match that here)

But is the circularity manifest in the manner suggested by Priest? In order to answer this question, we need to first be a bit clearer about what we mean by circularity in the first place. To begin with, consider the Liar sentence as formulated in arithmetic, which no one doubts is a paradigm instance of the sort of circularity at issue. We obtain the Liar by applying the unary version of the Gödelian diagonalization lemma:

Diag: For any unary predicate $\Phi(x)$ there is a sentence $\Psi$ such that: $\Psi \leftrightarrow \Phi(\ulcorner \Psi \urcorner)$ is a theorem of arithmetic.

to the following predicate:

$$\neg T(x)$$

(i.e. the negation of the truth predicate) to obtain a sentence that is equivalent to the claim that the truth predicate fails to apply to (the Gödel code of) that sentence:

$$\lambda \leftrightarrow \neg T(\ulcorner \lambda \urcorner)$$

How, exactly, should we understand the circularity found in the Liar sentence? The answer is that the Liar sentence is a fixed point of a predicate within arithmetic in the following sense:

A sentence $\Phi$ is a weak sentential fixed point of a unary predicate $\Psi(x)$ within a theory $\mathcal{T}$ if and only if: $\Phi \leftrightarrow \Psi(\ulcorner \Phi \urcorner)$ is a theorem of $\mathcal{T}$.

(The qualifier ‘weak’ is retained from (Cook 2006), where this notion of sentential fixed point is contrasted with a ‘strong’ notion. I do not explore this stronger notion here, however. Similar comments apply to the definition of weak predicate fixed point below.) This sort of analysis of circularity—in terms of the existence of linguistic fixed points—was first explored independently in (Leitgeb 2002) and (Cook 2006). Clearly, the Liar sentence $\lambda$ is a sentential fixed point of the predicate “$\neg T(x)$” within arithmetic supplemented with a truth predicate, and this seems like an adequate diagnosis of the sense in which the Liar sentence is circular. Further, we can formulate an analogous notion of fixed points for predicates as follows:

A unary predicate $\Phi(x)$ is a weak predicate fixed point of a binary predicate $\Psi(y, x)$ within a theory $\mathcal{T}$ if and only if:

$$(\forall x)(\Phi(x) \leftrightarrow \Psi(\ulcorner \Phi(y) \urcorner, x))$$

is a theorem of $\mathcal{T}$.

We can now understand Priest’s point as nothing more than the observation that, within arithmetic supplemented with a binary satisfaction predicate, the ‘Yablo’ predicate “$Y(x)$” is a weak predicate fixed point of the binary predicate:

$$(\forall n)(n > y \rightarrow \neg Sat(x, n))$$

In fact, the $UFYP$ amounts, in the end, to little more than a report of this fact (and recall that the $UFYP$ is provable using the diagonalization lemma alone, that is, in pure arithmetic, and does not require the T-schema, the $Y(x)\text{-}GSP$, or any other non-arithmetic resources for its proof).

There seems to be little point in arguing with Priest about whether this amounts to a genuine form of circularity (since squabbling about what ‘really’ counts as circular or not threatens to obscure the real issue—understanding the roots of paradox in the first place), or whether the Yablo paradox truly ‘suffers’ from this sort of circularity (since it is a mathematical fact that it does). But there are two observations that we can make.

First, Priest’s observations regarding the circularity of the Yablo paradox are part of a larger argument that all paradoxes involve circularity (or something like it, namely satisfaction of his inclosure schema; see (Priest 1995)) in an essential way. The problem, however, is that sentential and predicate fixed points seem too prevalent to do much explanatory work. As noted in (Leitgeb 2002) and (Cook 2006), it is easy to prove that every sentence in arithmetic (and in any theory at least as strong as arithmetic) is a weak sentential fixed point of some predicate, and every unary predicate is a weak predicate fixed point of some binary predicate.

Obtaining the result for predicate fixed points is straightforward (the analogous result for sentential fixed points is similar, simpler, and left to the reader). Given a unary predicate $\Phi(x)$, let $\Psi(x, y)$ be an arbitrary binary predicate, and apply the binary version of the Gödelian diagonalization lemma to:

$$\Phi(y) \leftrightarrow \Psi(\ulcorner \Phi(y) \urcorner, x)$$

obtaining a unary predicate $\Theta(x)$ such that:

$$(\forall x)(\Theta(x) \leftrightarrow (\Phi(x) \leftrightarrow \Psi(\ulcorner \Phi(y) \urcorner, \ulcorner \Theta(y) \urcorner)))$$

is a theorem. The above formula is equivalent to:

$$(\forall x)(\Phi(x) \leftrightarrow (\Theta(x) \leftrightarrow \Psi(\ulcorner \Phi(y) \urcorner, \ulcorner \Theta(y) \urcorner)))$$

Thus, $\Phi(x)$ is a weak predicate fixed point of:

$$\Theta(x) \leftrightarrow \Psi(y, \ulcorner \Theta(y) \urcorner)$$

As a result, although Priest’s claim that the Yablo paradox is circular, since it involves a predicate fixed point, is no doubt correct, it is not clear how useful the observation turns out to be in the end. If every predicate whatsoever in arithmetic (or in any theory at least as strong as arithmetic) is also a predicate fixed point, but the vast majority of such predicates do not give rise to paradoxes, then we are left wondering what special explanatory role this sort of circularity is meant to play in our account of the Yablo paradox.
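The equivalence step in the construction above, which swaps the positions of $\Theta(x)$ and $\Phi(x)$, relies only on the propositional fact that $A \leftrightarrow (B \leftrightarrow C)$ and $B \leftrightarrow (A \leftrightarrow C)$ always agree in truth value. A minimal sketch of a truth-table check follows; the function names are our own, and the letters `a`, `b`, `c` stand in for the truth values of $\Theta(x)$, $\Phi(x)$, and the $\Psi$ clause at a fixed $x$.

```python
from itertools import product


def iff(p: bool, q: bool) -> bool:
    """Material biconditional: true when both sides have the same truth value."""
    return p == q


def rearrangement_is_valid() -> bool:
    """Compare (A <-> (B <-> C)) with (B <-> (A <-> C)) on all eight valuations."""
    return all(
        iff(a, iff(b, c)) == iff(b, iff(a, c))
        for a, b, c in product([False, True], repeat=3)
    )


if __name__ == "__main__":
    print(rearrangement_is_valid())
```

The check covers only the propositional skeleton of the step. The quantifier $(\forall x)$ and the Gödel coding play no role in the rearrangement itself.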

Of course, the existence of fixed points is not necessarily the only way to understand or identify the presence of circularity. For example, (Leitgeb 2002) and (Yablo 2006) explore an analysis of circular linguistic constructions in terms of the structure of corresponding non-well-founded sets (although see (Cook 2014) for a series of criticisms of the viability of any such account).

Second, although Priest is right that the version of the Yablo paradox that can be constructed within arithmetic supplemented with a binary satisfaction predicate is circular in at least some minimal sense, since it explicitly involves a weak predicate fixed point, there are other ways to construct the paradox that do not seem to involve fixed points of this sort. One such method, developed in (Cook 2006) (and explored further in (Cook 2014)), involves utilizing infinitary conjunction, rather than arithmetic diagonalization, in the construction of the paradox. To construct this version of the paradox, we need a propositional language that contains (at a minimum) a countable infinity of sentence names:

$$S_1, S_2, S_3, \dots S_m, \dots$$

a falsity predicate “$F(x)$”, and (countably) infinitary conjunction ($\wedge$). We can then construct a version of the Yablo paradox by stipulating that each sentence name denotes the infinite conjunction of attributions of falsity to each sentence name with higher (finite) index or, a bit more intuitively:

$$\begin{aligned} S_1:&~ F(S_2) \wedge F(S_3) \wedge F(S_4) \wedge \dots \\ S_2: &~F(S_3) \wedge F(S_4) \wedge F(S_5) \wedge \dots \\ S_3: &~F(S_4) \wedge F(S_5) \wedge F(S_6) \wedge \dots \\ &~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots\\ S_m: &~F(S_{m+1}) \wedge F(S_{m+2}) \wedge F(S_{m+3}) \wedge \dots \\ &~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots \end{aligned}$$

This version of the Yablo paradox contains no fixed points of any sort. (See (Cook 2014) for a proof of this fact, as well as for the development of an infinitary propositional logic within which the derivation to a contradiction can be carried out. Both (Hardy 1995) and (Forster 2004) contain earlier explorations of infinitary, non-arithmetic versions of the paradox.) Since the important point, of course, is not whether the exact construction Yablo had in mind is circular or not, but rather whether Yablo’s work points in the direction of a truly non-circular paradox, this construction seems to provide good evidence for the claim that there are, in fact, non-circular paradoxes.

A final note, however: In constructing the infinitary conjunction version of the Yablo paradox, we have made a trade-off. In working somewhere other than in formal arithmetic, we have avoided the need to invoke diagonalization, and have as a result blocked Priest’s argument based on the existence of weak fixed points. In order to carry this out successfully, however, we have had to introduce additional, rather substantial resources: infinitary connectives (in particular, infinitary conjunctions, and rules of inference governing such infinitary linguistic expressions). Thus, this version of the paradox is only compelling as an example of a genuinely non-circular paradox insofar as infinitary resources such as infinite conjunctions are legitimate in the first place. Unsurprisingly, perhaps, Priest preemptively objects to exactly this sort of construction:

One might suggest the following. We leave the deduction as just laid out, but construe the $n$ in the reductio part of the argument as schematic, standing for any natural number. This gives us an infinity of proofs, one of $\neg Ts_n$, for each $n$. We may then obtain the conclusion $\forall n \neg Ts_n$ by an application of the $\omega$-rule:

$$\frac{\alpha(0), \alpha(1),\dots}{\forall x \alpha(x)}$$

The rest of the argument is as before. Construing the argument in this way, we do not have to talk of satisfaction. There is no predicate involved, a fortiori no fixed point predicate. We therefore have a paradox without circularity. (Priest 1997, 238–239)

Although Priest is describing a version of the paradox formulated in terms of an arithmetic $\omega$-rule, rather than in terms of infinitary conjunction, it is clear that his objections are intended to generalize to the case at hand:

As a matter of fact, we did not apply the $\omega$-rule [in his earlier sketch of the derivation of a contradiction], and could not have. The reason we know that $\neg Ts_n$ is provable for all $n$ is that we have a uniform proof, i.e. a proof for variable $n$. Moreover, no finite reasoner ever really applies the $\omega$-rule. The only way that they can know there is such a proof of each $\alpha(i)$ is because they have a uniform method of constructing such proofs. And it is this finite information that grounds the conclusion $\forall x \alpha(x)$. (Priest 1997, 239)

In short, Priest’s claim is that we do not, and in fact cannot, proceed via the infinitary methods in question (whether $\omega$-rule or infinitary conjunction): instead, any construction of the paradox and proof of its paradoxicality must rely on a uniform finitary method of construction and proof.

We will not attempt to settle this issue here (although the reader is encouraged to consult the discussion of this point in (Cook 2014)). Instead, we will conclude this section by noting that Priest’s claims about the impossibility of actually applying such infinitary reasoning strategies are controversial at best. Whether or not finite human beings (or other reasoners to which we wish our logical and semantic theories to apply) can carry out the sort of countably infinite supertasks required to construct the version of the paradox involving infinite conjunctions remains an open, and controversial, question.

4. Generalizing the Paradox

Once we have one instance of a non-circular paradox in hand, a natural question to ask is whether, and how, the basic idea underlying the Yablo paradox can be generalized. In other words, can we construct (apparently) non-circular variants of other paradoxes, and types of paradox? The answer, of course, is “yes”! We shall not attempt to produce a catalogue of all such generalizations here, however—instead, a handful of particularly interesting examples will be examined.

One of the earliest, and most well-known, variants of the Yablo paradox is Sorensen’s queue paradox. Sorensen provides a rather novel, and entertaining, presentation of this ‘dual’ version of the paradox:

An infinite queue of students receives a lecture on human fallibility. Each student thinks

(Q) Some of the students behind me are now thinking an untruth.

As it happens, each student is thinking just one thought: (Q). Of course, their different positions in the queue ensures that each token of (Q) expresses something different. (Sorensen 1998, 137)

We can formulate a version of this paradox more along the lines of the traditional presentation of the Yablo paradox by noting that each person in the queue is, in essence, thinking that at least one of the thoughts ‘behind’ her is false, obtaining:

$$\begin{aligned} S_{1}: ~&\text{There is an } m > 1 \text{ such that } S_{m} \text{ is false.}\\ S_{2}: ~&\text{There is an } m > 2 \text{ such that } S_{m} \text{ is false.}\\ S_{3}: ~&\text{There is an } m > 3 \text{ such that } S_{m} \text{ is false.}\\ & ~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots\\ S_{n}: ~&\text{There is an } m > n \text{ such that } S_{m} \text{ is false.}\\ & ~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots ~~~~~~ \vdots \end{aligned}$$

This dual version of the Yablo paradox is paradoxical in exactly the same sense as the original version (careful construction of the paradox within arithmetic, and derivation of the contradiction from the dual versions of $UFYP$ and $Y(x)\text{-}GSP$ are left to the reader).
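A minimal sketch of how the dual principles and the derivation might go, assuming the same conventions as in § 2. The predicate name $Y^{d}(x)$ and the labels "dual $UFYP$" and "dual $GSP$" are our own and are not drawn from the cited literature.

$$\begin{aligned}
&\text{dual } UFYP: ~ (\forall z)(Y^{d}(z) \leftrightarrow (\exists n)(n > z \wedge \neg Sat(\ulcorner Y^{d}(x)\urcorner, n)))\\
&\text{dual } GSP: ~ (\forall x)(Sat(\ulcorner Y^{d}(x)\urcorner, x) \leftrightarrow Y^{d}(x))
\end{aligned}$$

$$\begin{aligned}
&1.~ \neg Y^{d}(k) && \text{assumption, for arbitrary } k\\
&2.~ (\forall n)(n > k \rightarrow Sat(\ulcorner Y^{d}(x)\urcorner, n)) && \text{from 1, dual } UFYP\\
&3.~ (\forall n)(n > k \rightarrow Y^{d}(n)) && \text{from 2, dual } GSP\\
&4.~ Y^{d}(k+1) && \text{from 3, since } k+1 > k\\
&5.~ (\exists n)(n > k+1 \wedge \neg Sat(\ulcorner Y^{d}(x)\urcorner, n)) && \text{from 4, dual } UFYP\\
&6.~ (\exists n)(n > k \wedge \neg Y^{d}(n)) && \text{from 5, dual } GSP \text{, since } n > k+1 \text{ implies } n > k\\
&7.~ Y^{d}(k) && \text{from 1, 3, 6 by reductio, discharging 1}\\
&8.~ (\forall z)(Y^{d}(z)) && \text{from 7, universal generalization}\\
&9.~ (\exists n)(n > 1 \wedge \neg Sat(\ulcorner Y^{d}(x)\urcorner, n)) && \text{from 8 (instance } Y^{d}(1)\text{), dual } UFYP\\
&10.~ (\exists n)(n > 1 \wedge \neg Y^{d}(n)) && \text{from 9, dual } GSP \text{, contradicting 8}
\end{aligned}$$

The sketch mirrors the original pattern with the roles of truth and falsity exchanged: assuming that some sentence is false forces every later sentence to be true, which is impossible, so every sentence is true, which is also impossible.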

Sorensen’s construction of this dual version of the Yablo paradox opened up two areas of research. The first was the discovery of other patterns of reference that produced paradoxes, including ‘dual’ versions of many familiar semantic paradoxes. For example, (Cook 2002) and (Cook 2004) show that, for a wide class of sentences and sets of sentences, if we replace all conjunctions with disjunctions (and vice versa) and replace all universal quantifiers with existential quantifiers (and vice versa), then the construction that results will be paradoxical if and only if the original sentence or set of sentences was. In short, the connection between Yablo’s paradox and Sorensen’s dual version is just one instance of a much wider phenomenon. The techniques and tools of directed graph theory have been invaluable in the study of these patterns of paradox (for details, see (Cook 2002), (Cook 2004), (Rabern, Rabern, & Macauley 2013), and (Cook 2014)).

More important for our purposes, perhaps, is the fact that Sorensen’s queue paradox, as well as the other variants of the Yablo construction that he produced or collected and catalogued in (Sorensen 1998), led to important results connecting finitary paradoxes such as the Liar paradox and infinitary, Yabloesque constructions. Sorensen characterizes the search for general methods for transforming finitary, circular paradoxes into non-circular Yabloesque analogues as follows:

There is a wide family of paradoxes that are loosely characterized as self-referential. The simplicity of Yablo’s technique invites the conjecture that all of these paradoxes can be purged of self-reference. The conjecture could be demonstrated if there were a standard formalization of the self-referential puzzles. For one could then formulate an algorithm that mechanically transforms self-referential puzzles into Yabloesque versions. Unfortunately, there is no standard formalization. (Sorensen 1998, 150)

The most important attempts to formulate a standard, mechanical transformation of this sort are known as unwinding theorems: results which associate with any circular paradox a corresponding non-circular analogue of it (its unwinding) such that the infinitary, non-circular Yabloesque unwinding is paradoxical if and only if the original circular construction is paradoxical. The first unwinding theorems were proven in (Cook 2002) and (Cook 2004), and applied to a propositional language containing infinitary conjunction and a falsity predicate, similar to the language within which we constructed the infinitary conjunction version of the Yablo paradox in the previous section. Soon after, Philippe Schlenker produced a number of unwinding recipes for arithmetic supplemented with a truth predicate (see (Schlenker 2007a), (Schlenker 2007b)). These results are complex, but a few examples should suffice to get the basic idea across.

On the simplest unwinding recipe given in (Cook 2004), the unwinding of the Liar sentence is just the Yablo paradox. The unwinding of the open pair—that is, the pair of sentences where each sentence asserts the falsity of the other:

$$\begin{aligned} S_1:&~ F(S_2)\\ S_2:&~F(S_1) \end{aligned}$$

is the 2-Yablo chain: an infinite sequence of sentences where each sentence asserts the falsity of every second sentence (starting with the sentence immediately after the sentence in question):

$$\begin{aligned} S_1: &~ F(S_2) \wedge F(S_4) \wedge F(S_6) \wedge F(S_8) \wedge \dots \\ S_2: &~ F(S_3) \wedge F(S_5) \wedge F(S_7) \wedge F(S_9) \wedge \dots \\ S_3: &~ F(S_4) \wedge F(S_6) \wedge F(S_8) \wedge F(S_{10}) \wedge \dots \\ ~~~~~~ \vdots & ~~~~~~ \vdots ~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots\\ S_n: &~ F(S_{n+1}) \wedge F(S_{n+3}) \wedge F(S_{n+5}) \wedge F(S_{n+7}) \wedge \dots \\ ~~~~~~ \vdots & ~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots \end{aligned}$$

and the unwinding of the Liar ‘triple’:

$$\begin{aligned} S_1:&~ F(S_2)\\ S_2:&~F(S_3)\\ S_3:&~F(S_1) \end{aligned}$$

is the 3-Yablo chain: an infinite sequence of sentences where each sentence asserts the falsity of every third sentence (starting with the sentence immediately after the sentence in question):

$$\begin{aligned} S_1: &~ F(S_2) \wedge F(S_5) \wedge F(S_8) \wedge F(S_{11}) \wedge \dots \\ S_2: &~ F(S_3) \wedge F(S_6) \wedge F(S_9) \wedge F(S_{12}) \wedge \dots \\ S_3: &~ F(S_4) \wedge F(S_7) \wedge F(S_{10}) \wedge F(S_{13}) \wedge \dots \\ ~~~~~~ \vdots & ~~~~~~ \vdots ~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots\\ S_n: &~ F(S_{n+1}) \wedge F(S_{n+4}) \wedge F(S_{n+7}) \wedge F(S_{n+10}) \wedge \dots \\ ~~~~~~ \vdots & ~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots~~~~~~ \vdots \end{aligned}$$

Note that the open pair has two acceptable truth-value assignments (either $S_1$ is true and $S_2$ is false or $S_1$ is false and $S_2$ is true) as does the 2-Yablo chain (either the even indexed sentences are true and the odd indexed sentences are false, or the even indexed sentences are false and the odd indexed sentences are true), while both the Liar ‘triple’ and its unwinding, the 3-Yablo chain, are paradoxical.
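The claims about the two finite, circular constructions can be checked by brute force. The following is a minimal sketch, assuming that an "acceptable" assignment is one on which each sentence is true exactly when the sentence it calls false is indeed false; the function name `consistent_assignments` is ours and purely illustrative. It covers only the finite cycles (the open pair and the Liar triple), not the infinite Yablo chains, which cannot be enumerated in this way.

```python
from itertools import product


def consistent_assignments(k):
    """Return all acceptable truth-value assignments for a Liar cycle of length k.

    Sentence i (0-based) asserts that sentence (i + 1) % k is false, so an
    assignment is acceptable when every sentence is true exactly when the
    next sentence in the cycle is false.
    """
    acceptable = []
    for values in product([True, False], repeat=k):
        if all(values[i] == (not values[(i + 1) % k]) for i in range(k)):
            acceptable.append(values)
    return acceptable


# Open pair: S_1 says S_2 is false, S_2 says S_1 is false.
print(consistent_assignments(2))  # [(True, False), (False, True)]

# Liar triple: S_1 -> S_2 -> S_3 -> S_1, each asserting the next is false.
print(consistent_assignments(3))  # []
```

On this encoding the open pair has exactly the two assignments described above, while the Liar triple has none. The single Liar sentence is the case `k = 1`, which likewise has no acceptable assignment.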

Unwinding theorems show that we can, in many cases at least, transform a finite, circular construction (whether paradoxical or not) into an infinitely descending, Yabloesque construction with many of the same properties as the original, finite construction. Unfortunately, however, unwinding recipes do not always deliver what we might take to be their most important desideratum: transforming a finite, circular construction into an infinitely descending, non-circular Yabloesque construction. For example, let “$Bew(x)$” be Gödel’s provability predicate, and consider the Henkin sentence $\Gamma$ obtained by diagonalizing on the provability predicate: $$\Gamma \leftrightarrow Bew(\ulcorner \Gamma \urcorner)$$ Sentence $\Gamma$ ‘says’ (in effect) that it is provable. If we apply the simplest unwinding recipe found in (Schlenker 2007a), (Schlenker 2007b) to the Henkin sentence, we obtain the Uniform Fixed Point Principle for “$Bew(x)$”: $$(\forall y)(Y_\Gamma(y) \leftrightarrow (\forall x > y)(Bew(\ulcorner Y_\Gamma (\dot{x}) \urcorner)))$$ which has the following infinite, Yabloesque sequence of sentences as its instances:

$$\begin{aligned} Y_\Gamma(1)& \leftrightarrow (\forall x > 1)(Bew( \ulcorner Y_\Gamma(\dot{x}) \urcorner))\\ Y_\Gamma(2)& \leftrightarrow (\forall x > 2)(Bew(\ulcorner Y_\Gamma(\dot{x}) \urcorner))\\ Y_\Gamma(3)& \leftrightarrow (\forall x > 3)(Bew(\ulcorner Y_\Gamma(\dot{x}) \urcorner))\\ &\vdots ~~~~~~~~ \vdots ~~~~~~~~ \vdots ~~~~~~~~ \vdots\\ Y_\Gamma(n)& \leftrightarrow (\forall x > n)(Bew(\ulcorner Y_\Gamma(\dot{x}) \urcorner))\\ &\vdots ~~~~~~~~ \vdots ~~~~~~~~ \vdots ~~~~~~~~ \vdots \end{aligned}$$

Intuitively, each of these sentences (which can be constructed within first-order Peano arithmetic, since the provability predicate “$Bew(x)$” is definable in arithmetic), ‘says’ that all of the sentences ‘below’ it in the list can be proven in Peano arithmetic.

Unfortunately for anyone hoping that unwinding recipes will always provide a non-circular analogue of the original circular construction with which we started (in the sense of replacing weak sentential fixed points with weak predicate fixed points, at the very least), every sentence in the unwinding of the Henkin sentence is still a weak sentential fixed point of the provability predicate “$Bew(x)$”, as is shown by the following derivation. (The discussion here follows (Cook 2014). Similar constructions can also be found in (Cieslinski & Urbaniak 2013) and (Leach-Krouse 2014)):

(Note: This derivation relies on (i) the fact that the provability predicate satisfies a version of the converse Barcan formula (lines 2–3), and (ii) Löb’s theorem (lines 5–6).) Despite this limitation, there remains much of interest, and much yet left to be discovered, in the study of unwindings.

One final generalization of the Yablo paradox is worth mentioning. Recent work on the semantic paradoxes has highlighted the fact that the Curry paradox (and the difficulties with accounting for the material conditional that come with it) is as important as, and perhaps more important than, the Liar paradox. The Curry paradox (Curry 1942), which, like many other constructions discussed in this chapter, can be constructed via diagonalization, is just a sentence that is equivalent to the claim that, if the sentence in question is true, then some other unrelated sentence is true. Actually, the Curry paradox is an infinite collection of intimately related paradoxes, one for each sentence. Thus, given any sentence $S$, consider: $$\Xi : T(\ulcorner \Xi \urcorner) \rightarrow S$$ Note that the Curry sentence, like the Liar sentence, involves a weak sentential fixed point. Given (the mere existence, not the truth of) such a sentence $\Xi$, and the relevant instance of the T-schema, we can now prove $S$:
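The reasoning can be sketched as follows. This is one standard way of setting out the Curry derivation, with line numbering and justifications of our own; it assumes only the T-schema instance for $\Xi$, conditional proof, and modus ponens, and is not necessarily the layout of the original presentation.

$$\begin{aligned} 1.~& T(\ulcorner \Xi \urcorner) \leftrightarrow (T(\ulcorner \Xi \urcorner) \rightarrow S) \quad \text{(T-schema instance for } \Xi \text{)}\\ 2.~& T(\ulcorner \Xi \urcorner) \quad \text{(assumption for conditional proof)}\\ 3.~& T(\ulcorner \Xi \urcorner) \rightarrow S \quad \text{(from 1 and 2)}\\ 4.~& S \quad \text{(from 2 and 3, modus ponens)}\\ 5.~& T(\ulcorner \Xi \urcorner) \rightarrow S \quad \text{(from 2–4, discharging the assumption)}\\ 6.~& T(\ulcorner \Xi \urcorner) \quad \text{(from 1 and 5)}\\ 7.~& S \quad \text{(from 5 and 6, modus ponens)} \end{aligned}$$

Since $S$ was arbitrary, the same pattern of reasoning yields a proof of any sentence whatsoever.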

What does this have to do with the Yablo paradox? One way to see the connection is as follows: we can treat negation as defined in terms of the conditional and a primitive absurdity constant “$\bot$”: $$\neg \Phi =_{df} \Phi \rightarrow \bot$$ Given this definition, the Liar paradox: $\lambda: \neg T(\ulcorner \lambda \urcorner)$ is just the instance of the Curry paradox obtained by substituting the absurdity constant “$\bot$” for $S$: $$\lambda: T(\ulcorner \lambda \urcorner) \rightarrow \bot$$ Given all of this, it is now apparent that we have (at least) two dimensions that we can vary when considering semantic paradoxes. On the one hand, we can contrast paradoxes that involve negation or falsity (such as the Liar paradox and the Yablo paradox) with paradoxes that involve the conditional (such as the Curry paradox). On the other hand, we can contrast paradoxes that involve weak sentential fixed points (such as both the Liar and Curry) with paradoxes that involve weak predicate fixed points and infinitely descending sequences of sentences (such as the Yablo paradox). As the following chart makes clear, however, on this way of characterizing things, there is a possibility we have not yet explored:

	Negation	Conditional
Sentential Fixed Point (Finite)	Liar Paradox	Curry Paradox
Predicate Fixed Point (Infinite)	Yablo Paradox	???

In short, we have not seen a paradox that involves the conditional, rather than negation, and which also involves the sort of infinitely descending, apparently non-circular construction found in the Yablo paradox. What we want is a Curry-Yablo hybrid, which we shall (following (Cook 2014)) call the Yablurry paradox.

Intuitively, it is rather obvious how to construct such a hybrid. Generalizing the insights found in the discussion of unwindings above, and given an arbitrary sentence $S$, we want each sentence in such a Curry-Yablo hybrid to ‘say’ that the sentences below it entail $S$. There is an ambiguity here, however. We can either have each sentence ‘say’ that, for each sentence below it in the list, if that sentence is true then $S$, or we can have each sentence ‘say’ that, if all the sentences below it are true, then $S$. The first option, first formulated in (Cook 2009a), provides the Yablurry paradox:

$$\begin{aligned} S_1: ~&(\forall m > 1)(T(\ulcorner S_m\urcorner) \rightarrow S)\\ S_2: ~&(\forall m > 2)(T(\ulcorner S_m\urcorner) \rightarrow S)\\ S_3: ~&(\forall m > 3)(T(\ulcorner S_m\urcorner) \rightarrow S)\\ & \vdots ~~~~~~~~ \vdots ~~~~~~~~ \vdots ~~~~~~~~ \vdots \\ S_k: ~&(\forall m > k)(T(\ulcorner S_m\urcorner) \rightarrow S)\\ & \vdots ~~~~~~~~ \vdots ~~~~~~~~ \vdots ~~~~~~~~ \vdots \end{aligned}$$

and the second option, which was first formulated in (Beall 1999), gives us the Dual Yablurry paradox:

$$\begin{aligned} S_1:~ &((\forall m > 1)(T(\ulcorner S_m \urcorner))) \rightarrow S \\ S_2: ~&((\forall m > 2)(T(\ulcorner S_m \urcorner ))) \rightarrow S \\ S_3: ~&((\forall m > 3)(T(\ulcorner S_m \urcorner ))) \rightarrow S \\ & \vdots ~~~~~~~~ \vdots ~~~~~~~~ \vdots ~~~~~~~~ \vdots ~~~~~~~~ \vdots \\ S_k: ~&((\forall m > k)(T(\ulcorner S_m \urcorner ))) \rightarrow S \\ & \vdots ~~~~~~~~ \vdots ~~~~~~~~ \vdots ~~~~~~~~ \vdots ~~~~~~~~ \vdots \end{aligned}$$

The reasons for calling the first formulation the Yablurry paradox, and the second its Dual, are straightforward. Just as we obtain the Liar sentence when we substitute the absurdity constant for $S$ in the Curry paradox (and understand negation to be defined in terms of the conditional and absurdity), we should expect the instance of the Yablurry paradox obtained by replacing $S$ with “$\bot$” to be (or be equivalent to) the Yablo paradox. And this is exactly the result we obtain. Each sentence in the Yablurry paradox is of the following form: $$S_k: (\forall m > k)(T(\ulcorner S_m\urcorner) \rightarrow S)$$ If we replace $S$ with “$\bot$”, we obtain: $$S_k: (\forall m > k)(T(\ulcorner S_m\urcorner) \rightarrow \bot)$$ which then, applying our definition of negation, becomes: $$S_k: (\forall m > k)(\neg T(\ulcorner S_m\urcorner))$$ This is exactly what we would expect of a Curry-Yablo hybrid: given the substitution of “$\bot$” for $S$, each sentence in the resulting construction ‘says’ that all sentences below it in the list are false (that is, fail to be true).

If we carry out the same substitution in the Dual Yablurry, however, we do not obtain the Yablo paradox. Each sentence in the Dual Yablurry is of the following form: $$S_k: ~((\forall m > k)(T(\ulcorner S_m \urcorner ))) \rightarrow S$$ Replacing $S$ with “$\bot$”, we obtain: $$S_k: ~((\forall m > k)(T(\ulcorner S_m \urcorner ))) \rightarrow \bot$$ Applying the definition of negation in terms of the conditional and “$\bot$”, we get: $$S_k: ~\neg (\forall m > k)(T(\ulcorner S_m \urcorner ))$$ which is logically equivalent to: $$S_k: ~ (\exists m > k)(\neg T(\ulcorner S_m \urcorner ))$$ In other words, the result of substituting “$\bot$” for $S$ in this construction is Sorensen’s dual version of the Yablo paradox, hence the terminology Dual Yablurry paradox.

There are, of course, many other generalizations or modifications of the basic Yablo pattern in the literature, and many more variants no doubt yet to be discovered. Hopefully, however, the examples above give the reader some idea of the philosophical and mathematical richness of the Yablo paradox and the puzzles and problems connected to it. There remains, of course, much work to be done.

5. References and Further Reading

Barrio, E. (2010), “Theories of Truth without Standard Models and Yablo’s Sequences”, Studia Logica 96: 377-393.
Beall, J. (1999), “Completing Sorensen’s Menu: A Non-modal Yabloesque Curry Paradox”, Mind 108: 737-739.
Beall, J. (2001), “Is Yablo’s Paradox Non-Circular?”, Analysis 61: 176-187.
Beall, J. (ed.) (2003), Liars and Heaps, Oxford: Oxford University Press.
Beall, J. (ed.) (2008a), Revenge of the Liar: New Essays on the Paradox, Oxford: Oxford University Press.
Bolander, T., V. Hendricks, & S. Pedersen (2006), Self-reference, Stanford: CSLI Lecture Notes.
Cieslinski, C. & R. Urbaniak (2013), “Gödelizing the Yablo Sequence”, Journal of Philosophical Logic 42: 679-695.
Childers, T. & O. Majer (eds.) (2002), The Logica Yearbook 2002, Prague: Filosophia Press.
Cook, R. (2002), “Patterns of Paradox: Paradox and Parity”, in Childers and Majer (2002): 69-83.
Cook, R. (2004), “Patterns of Paradox”, Journal of Symbolic Logic 69: 767-774.
Cook, R. (2006), “There are Non-Circular Paradoxes (but Yablo’s Isn’t One of Them!)”, The Monist 89: 118-149.
Cook, R. (2009a), “Curry, Yablo, and Duality”, Analysis 69: 506-514.
Cook, R. (2012), Key Concepts in Philosophy: Paradoxes, Polity Press.
Cook, R. (2014), The Yablo Paradox: An Essay on Circularity, Oxford: Oxford University Press.
Curry, H. (1942), “The Inconsistency of Certain Formal Logics”, Journal of Symbolic Logic 7: 115-117.
Forster, T. (2004), “The Significance of Yablo’s Paradox without Self-reference”, Logique et Analyse 47: 461-462.
Gabbay, D. & F. Guenthner (eds.) (2002), The Handbook of Philosophical Logic, Volume 11, Dordrecht: Kluwer Academic.
Gödel, K. (1931), On Formally Undecidable Propositions of Principia Mathematica and Related Systems, New York: Dover.
Gupta, A. (1982), “Truth and Paradox”, Journal of Philosophical Logic 11: 1-60.
Hardy, J. (1995), “Is Yablo’s Paradox Liar-like?”, Analysis 55: 197-198.
Herzberger, H. (1982a), “Notes on Naïve Semantics”, Journal of Philosophical Logic 11: 61-102.
Herzberger, H. (1982b), “Naïve Semantics and the Liar Paradox”, Journal of Philosophy 79: 497.
Ketland, J. (2004), “Bueno & Colyvan on Yablo’s Paradox”, Analysis 64: 165-172.
Ketland, J. (2005), “Yablo’s Paradox and ω-Inconsistency”, Synthese 145: 295-302.
Kripke, S. (1975), “Outline of a Theory of Truth”, Journal of Philosophy 72: 690-716.
Leach-Krouse, G. (2014), “Yablifying the Rosser Sentence”, Journal of Philosophical Logic 43: 827-834.
Leitgeb, H. (2002), “What is a Self-Referential Sentence? Critical Remarks on the Alleged (Non)-Circularity of Yablo’s Paradox”, Logique et Analyse 177: 3-14.
Picollo, L. (2013), “Yablo’s Paradox in Second-Order Languages: Consistency and Unsatisfiability”, Studia Logica 101: 601-613.
Priest, G. (1995), Beyond the Limits of Thought, Cambridge: Cambridge University Press.
Priest, G. (1997), “Yablo’s Paradox”, Analysis 57: 236-242.
Rabern, L., B. Rabern, & M. Macauley (2013), “Dangerous Reference Graphs and Semantic Paradoxes”, Journal of Philosophical Logic 42: 727-765.
Schlenker, P. (2007a), “The Elimination of Self-Reference: Generalized Yablo Series and the Theory of Truth”, Journal of Philosophical Logic 36: 251-307.
Schlenker, P. (2007b), “How to Eliminate Self-Reference: A Precis”, Synthese 158: 127-138.
Sorensen, R. (1998), “Yablo’s Paradox and Kindred Infinite Liars”, Mind 107: 137-155.
Visser, A. (2002), “Semantics and the Liar Paradox”, in Gabbay and Guenthner (2002): 149-240.
Yablo, S. (1985), “Truth and Reflection”, Journal of Philosophical Logic 14: 297-349.
Yablo, S. (1993), “Paradox without Self-Reference”, Analysis 53: 251-252.
Yablo, S. (2006), “Circularity and Paradox”, in Bolander, Hendricks, & Pedersen (2006): 165-183.

Author Information

Roy Cook
Email: roycookparadox@gmail.com
University of Minnesota
U. S. A.

David Hume: Imagination

David Hume (1711–1776) approaches questions in epistemology, metaphysics, ethics and aesthetics via questions about our minds. For example, before addressing the epistemological question of whether we have any justification for our beliefs about unobserved states of affairs, Hume asks which of our cognitive faculties is responsible for these beliefs. Before addressing the metaphysical question of what causal necessity (or necessary connexion) is, he asks what idea we have of a necessary connection between a cause and its effect. And before addressing the ethical question of why we are morally obligated to treat other people justly, he asks why we naturally sympathize with people whose interests suffer due to injustice. Hume tries to answer these and other questions about our minds empirically (that is, by observing himself and other people) and systematically. He calls this project his “science of man”; today, we would regard it as an amalgam of philosophy of mind, psychology and sociology.

One of the main discoveries that Hume claims to make, as a scientist of man, is that “men are mightily govern’d by the imagination.” He argues that the faculty of imagination is responsible for important features both of each individual human being’s mind and of the social arrangements that human beings form collectively. Concerning each individual human being’s mind, Hume argues that the imagination explains how we can form “abstract” or “general” ideas (that is, ideas that represent categories of things); how we reason from causes to their effects, or from effects to their causes; why we tend to sympathize, or share the feelings of other people; and why we project some of our feelings onto objects in the world around us. He also argues that the imagination explains numerous “fictions” that we believe. Concerning human social arrangements, Hume argues that features of the imagination explain why we need to form governments, and shape the laws that we adopt, including those that govern the distribution of property and those that govern the passage of national authority from one monarch to the next.

This article starts by explaining Hume’s views about thought in general. It then focuses on his views about imaginative thought in particular. It explains his conception of the imagination and its relations to our other faculties of thought, highlighting the continuities and discontinuities between his views and those of his Early Modern predecessors. It then presents some of the basic functions that Hume thinks the imagination performs, and surveys some highlights of his science of man, showing how he uses the imagination’s basic functions to explain several important mental phenomena. It then examines “fictions of the imagination,” which have an important place in his science of man, and his view that whatever we can clearly imagine is possible. Lastly, it discusses the relationship between Hume’s theory of the imagination and his skepticism.

Thought, Ideas and the Copy Principle
The Imagination and Our Other Faculties of Thought
Five Basic Functions of the Inclusive Imagination
Four Non-Basic Functions of the Inclusive Imagination
Fictions of the Imagination
Imaginability and Possibility
The Imagination and Hume’s Skepticism
References and Further Reading

1. Thought, Ideas and the Copy Principle

Hume writes that “men are mightily govern’d by the imagination” (T 3.2.7.2; SBN 534). And imagination is a kind of thought. To understand Hume’s views about imaginative thought, specifically, we must first examine some of his views about thought in general: his distinction between impressions and ideas; his distinction between simple and complex perceptions; and his Copy Principle. Hume’s main discussions of these topics are in A Treatise of Human Nature (hereafter, Treatise) Book 1, Part 1, Section 1; paragraphs 5–7 of Hume’s “Abstract” of the Treatise; and Section 2 of An Enquiry Concerning Human Understanding (hereafter, “first Enquiry”).

Hume tries to explain everything that takes place in our minds, including thought, by appealing to perceptions and their interactions. He distinguishes two kinds of perceptions: impressions and ideas (T 1.1.1.1; SBN 1–2; T Abs 5; SBN 647; and E 2.1–3; SBN 17–18). He equates having impressions with “feeling,” or first-hand experience. So, our impressions include all of the sensations, passions and emotions that we experience when we engage in sensory perception, feel painful or pleasurable sensations in our bodies, or feel passions like love and hatred. He equates having ideas with thinking: in his view, thinking about an object, or thinking that a certain state of affairs obtains, involves forming an idea that represents this object or state of affairs. The only difference that Hume sees between impressions and ideas is their degree of force and liveliness, or force and vivacity. Impressions are more forceful and lively than ideas: for example, actually feeling a pain is more forceful and lively than merely thinking about a pain. (Scholars disagree about how to interpret Hume’s talk of force and vivacity. According to some, a perception’s force and vivacity is a matter of how it feels to have that perception: that is, a matter of its phenomenology. According to others, a perception’s force and vivacity is a matter of how it behaves in our minds: that is, a matter of its functional role.)

Hume also distinguishes simple and complex perceptions (T 1.1.1.2; SBN 2). This cuts across his distinction between impressions and ideas, so that there are four categories of perception altogether: simple impressions; complex impressions; simple ideas; and complex ideas. A complex perception is made up of parts. For example, when you look at a Granny Smith apple in good light, you experience an array of color-sensations; Hume would regard this array of sensations as a complex impression. When you bite into a Granny Smith apple, you experience a sensation that is made up of various taste-sensations, smell-sensations, tactile sensations of the apple’s texture on your tongue, and so forth. Again, Hume would regard this overall sensation as a complex impression. When you think about a Granny Smith apple, you form a complex idea (or less forceful perception) made up of similar parts. Suppose that we broke one of these complex perceptions up into its parts, and examined each of them individually. Perhaps we would find that some of these perceptions have parts of their own. For example, perhaps your impression of the apple’s taste is itself made up of different parts—a sweet-sensation and a tart-sensation, say. Hume thinks that this process of breaking a perception into parts, and then breaking these parts into parts, could not go on forever. Eventually, we would reach perceptions that have no parts of their own. Hume calls these perceptions simple. He holds that every perception is either simple, or is built up entirely from simple perceptions, in which case it is complex. (Hume does not explicitly distinguish simple from complex perceptions in the “Abstract” or the opening sections of the first Enquiry. Nonetheless, he continues to rely on this distinction: for example, see E 2.5–6; SBN 19, E 3.1; SBN 23 and E 7.4; SBN 62.)

Hume argues that each of your simple ideas is caused by, and exactly resembles, a simple impression that you have previously had—in other words, each of your simple ideas is an exact copy of one of your simple impressions (T 1.1.1.7–12; SBN 4–7; T Abs 6–7; SBN 647–8; E 2.4–9; SBN 18–22). Scholars often call this Hume’s Copy Principle. Since Hume thinks that every idea is either simple or complex, and that a complex idea is entirely made up of simple ones, it follows that every idea is either an exact copy of an impression, or is entirely made up of such copies. Because ideas are less forceful than impressions, Hume calls them “faint images” of our impressions.

According to Hume, then, thinking involves forming a faint image, or assembling a montage of faint images, of sensations, passions, and emotions. Since the imagination is a faculty of thought, it is a faculty by which we form such images.

2. The Imagination and Our Other Faculties of Thought

This section addresses Hume’s views about the faculty of imagination, its parts or sub-faculties, and its relationship to the two other faculties of thought that Hume distinguishes: memory and reason. The main texts on this topic are Treatise Book 1, Part 1, Section 3; Book 1, Part 3, Section 5; and Hume’s footnote to Book 1, Part 3, Section 9, Paragraph 19.

a. Two Senses of “Imagination”

In the Treatise, Hume explains that he uses the word “imagination” (and its synonym, “fancy”) in two different senses:

When I oppose the imagination to the memory, I mean the faculty, by which we form our fainter ideas. When I oppose it to reason, I mean the same faculty, excluding only our demonstrative and probable reasonings. (T 1.3.9.19n22; SBN 118n)

Some twentieth-century scholars thought that these two different senses of “imagination” refer to two completely different mental faculties: a faculty of feigning or make-believe, and a faculty for apprehending real things. (For example, see Kemp Smith 1941: 459.) However, this seems to conflict with what Hume says: that both senses of “imagination” refer to “the same faculty.” Therefore, scholars in the late twentieth and early twenty-first century broadly agree on the following interpretation of the two senses. In one sense, “imagination” picks out a faculty that is responsible for all of our thoughts except for memories. In this sense, the imagination includes our faculty of “reason” (by which Hume here means our faculty for carrying out “demonstrative or probable reasonings”) as one of its parts or sub-faculties. Let us therefore call it the inclusive imagination. In the second sense that Hume distinguishes, “imagination” picks out the non-rational part or sub-faculty of the inclusive imagination, that is, the part that is not reason. Hume indicates that this other, non-rational sub-faculty is responsible for our whimsies and prejudices (T 1.3.9.19n22; SBN 117n), and often suggests that its properties seem to be trivial (T 1.4.2.56, 1.4.6.6n50, 1.4.7.3, 1.4.7.6, 1.4.7.7; SBN 217, 254, 265, 267). Because it excludes reason as well as memory, let us call this non-rational sub-faculty the exclusive imagination.

b. Inclusive Imagination vs. Memory

Hume contrasts the inclusive imagination with the memory. What difference does he see between these faculties? Early in the Treatise, he explains that they differ in two main ways. First, the ideas that make up a memory are “much more lively and strong” than the ideas that we form by the inclusive imagination (T 1.1.3.1; SBN 9). Second, the ideas that make up a memory must occur in the same “order and form,” or “order and position,” as the impressions from which they are copied (T 1.1.3.2–3; SBN 9). For example, when I remember a melody that I have heard, my ideas of the notes composing that melody must occur in the same order as my earlier impressions of those notes. In contrast, the imagination “is not restrain’d to the same order or form with the original impressions” (T 1.1.3.2; SBN 9). I can imagine a melody made up of notes that I have experienced before, but occurring in an order that I have never experienced before.

According to Hume, then, the ideas that we form in the course of remembering things are not completely different from those that we form when we are imagining things. In fact, these ideas are intrinsically alike, save for their degree of force and vivacity: they are all “faint images” of impressions, or montages of such images; but the ideas of the inclusive imagination are even fainter than those of the memory.

c. Exclusive Imagination vs. Reason

Hume distinguishes two parts or sub-faculties within the inclusive imagination: the exclusive imagination and reason. What difference does he see between these sub-faculties? Again, there seem to be two main answers. First, these sub-faculties differ with respect to their function, or what they do. By “reason,” Hume here means the sub-faculty by which we make demonstrative and probable inferences. In contrast, the exclusive imagination is the sub-faculty by which we form non-rational whimsies and prejudices, and various imaginative “fictions,” which are discussed in section (5) below. Second, these sub-faculties differ with respect to the permanence, irresistibility and universality of their operations. Operations of reason, like inferring causes from their effects, are permanent, irresistible and universal features of human minds. In contrast, the whimsies and prejudices due to the exclusive imagination occur only at certain times and in certain places, and they can be avoided with sufficient strength of mind.

However, perhaps Hume thinks that some operations of the exclusive imagination are just as permanent, irresistible, and universal as those of reason. He says that probable reasoning and our belief that sensible objects continue to exist, at times when nobody perceives them, are “equally natural and necessary in the human mind” (T 1.4.7.4; SBN 266). But he also says that our belief in the continued, unperceived existence of sensible objects is a fiction due to the exclusive imagination (T 1.4.2.14–43 and 52; SBN 193–209 and 215). So, he seems to hold that at least one operation of the exclusive imagination—its production of this belief—is just as permanent, irresistible, and universal as the operations of reason.

According to some scholars, Hume uses the word “reason” in different ways, only sometimes using it to pick out the sub-faculty of the inclusive imagination by which we carry out demonstrative and probable reasoning. Therefore, the reader should be careful not to assume that Hume is always talking about this sub-faculty, whenever he talks about reason.

For further discussion of Hume’s contrast (or contrasts) between reason and the imagination, see section (7), below.

d. Continuities between Hume’s Views and His Predecessors’

Among Early Modern philosophers, the imagination was generally conceived as our faculty for forming a distinctive kind of idea: mental “images” that resemble sensory experiences. For example, René Descartes writes that “whatever we conceive with an image is an idea of the imagination” (“To Mersenne, July 1641”; CSMK 186) and explains that imaginative images resemble sensory experience:

When I imagine a triangle, for example, I do not merely understand that it is a figure bounded by three lines, but at the same time I also see the three lines with my mind’s eye as if they were present before me; and this is what I call imagining. (CSM 2:50)

When one imagines a triangle, it is “as if” one were sensing it. Similarly, Thomas Hobbes, Nicolas Malebranche and George Berkeley all characterize the imagination as a faculty for forming ideas that closely resemble sensory experiences.

As we have seen, Hume thinks that every idea is either simple or complex; that every simple idea is copied from a simple impression (that is, from a simple sensation, passion or emotion); and that every complex idea is made up entirely of simple ones. He must therefore accept that all ideas resemble experiences: a simple idea resembles the experience that we have, when we have the simple impression from which it is copied; and a complex idea resembles the experiences that we have, when we have the simple impressions from which its parts are copied. So, much like his predecessors, he holds that all the ideas we form by means of the inclusive imagination resemble sensory experiences—if the word “sensory” is construed in a broad way, so as to include passionate and emotional experiences. (Hume does not think that this is distinctive of the inclusive imagination: in his view, the memory also uses ideas that resemble sensory experiences. However, most of his Early Modern predecessors regarded memory as a kind of imagination, so there is no significant disagreement between him and them on this point.)

Hume’s Early Modern predecessors also thought that the imagination is a faculty by which we make a distinctive kind of transition among ideas: habituated, or associative, transitions. For example, Hobbes thinks that the successions of mental images that take place in the imagination tend to resemble the successions of sensory experiences that gave rise to those mental images. Similarly, Malebranche holds that each type of mental image is paired with a type of physical image or “trace” in the brain, and that these physical traces come to be connected or associated with each other; thanks to these connections among physical traces, the mental images that are paired with them also come to be associated. And Gottfried Leibniz writes that, in both human and non-human animal minds, the perceptions of the memory or imagination come to be associated by a kind of habituation.

Hume agrees with these philosophers that the inclusive imagination serves to associate its ideas, or faint images, with other perceptions; this applies to both of its parts or sub-faculties—reason and the exclusive imagination. Below, section (3c) discusses Hume’s views of imaginative association in detail; section (4) discusses some of its applications.

e. Discontinuities between Hume’s Views and His Predecessors’

There are two main discontinuities between Hume’s views of the imagination and those of earlier modern philosophers. First, numerous Early Modern philosophers held that we have a faculty of pure understanding or pure intellect, by which we can form purely intellectual ideas, that is, ideas that are completely unlike sensations, passions, or emotions. For example, Descartes claims that we can understand the difference between a chiliagon (a 1,000-sided shape) and a myriagon (a 10,000-sided shape), but that we cannot represent this difference to ourselves by forming mental images. This is because the image that we form in trying to imagine a chiliagon is no different from the one that we form in trying to imagine a myriagon: in each case, the best that we can do is to make a fuzzy attempt to depict many sides (CSM 2:50). More importantly, by Descartes’s lights, we can form ideas of incorporeal things, such as God and the human soul, but we cannot imagine such things, because imagining is “simply contemplating the shape or image of a corporeal thing” (CSM 2:19). Descartes infers that, when we grasp the difference between a chiliagon and a myriagon, or conceive of God, we do so by forming a purely intellectual idea. Antoine Arnauld, Malebranche, Benedict de Spinoza and Gottfried Leibniz also posit purely intellectual ideas, in order to explain certain kinds of human thought.

Hume does not include a faculty of pure intellect in his taxonomy of mental faculties: he thinks that all ideas are faint images, or montages of such images, belonging to the memory or to the inclusive imagination. He has at least three reasons for denying that we have a faculty of pure intellect. First, he argues that the Copy Principle rules out purely intellectual ideas (T 1.3.1.7; SBN 72–73). Copying involves resemblance: a copy resembles the original from which it is made. So, if every simple idea is an exact copy of a simple impression, and every complex idea is composed wholly of simple ideas, then every idea resembles an impression or several impressions. So, no idea is purely intellectual.

Second, Hume gives specific arguments against the existence of certain purely intellectual ideas that Descartes and his followers had posited. For example, Descartes argued that we conceive the nature of a particular material substance, like a piece of wax, by means of the pure intellect. In contrast, Hume argues that we can only conceive a substance to be a collection of sensible qualities “united by the imagination” (T 1.1.6.1–2; SBN 15–16). So, we conceive of substances by means of the imagination, not by means of a purely intellectual idea. Similarly, Descartes held that the idea of God is purely intellectual. In contrast, Hume argues that one can form an idea of God by augmenting one’s idea of one’s own mind with further ideas copied from impressions (E 2.6; SBN 19).

Third, Hume thinks that his opponents’ principal reason for positing a faculty of pure intellect is to “explain our abstract ideas, and to show how we can form an idea of a triangle, for instance, which shall neither be an isosceles nor a scalenum, nor be confin’d to any particular length or proportion of sides” (T 1.3.1.7; SBN 72). Hume thinks that he can explain our abstract ideas just by appealing to the ideas and basic functions of the inclusive imagination. If this explanation succeeds, then it shows that we do not need a faculty of pure intellect in order to form abstract ideas. Therefore, Hume thinks that his explanation undermines his opponents’ principal reason for positing this faculty. Section (4a), below, discusses Hume’s account of abstract ideas in more detail.

The second main discontinuity between Hume’s views and those of his predecessors concerns reasoning. According to many of Hume’s predecessors, including Descartes, Leibniz and John Locke, reasoning involves mental events or processes that are both rational and basic, meaning that they cannot be explained in terms of any simpler mental events or processes. In contrast, Hume tries to explain reasoning in terms of more basic mental functions, which are common to reason and to the exclusive imagination. For example, he thinks that the same basic principles of association among ideas explain both flights of fancy due to the exclusive imagination and probable inferences due to reason. This is why Hume regards reason and the exclusive imagination as two sub-faculties of the inclusive imagination: their functions are built up out of the same basic imaginative functions.

Because Hume places our whole faculty of reason within the inclusive imagination, it seems he must say that demonstrative reasoning can be explained in terms of functions that are common to reason and the exclusive imagination. But it is a hard question whether he can carry out this explanation successfully. For a helpful discussion of this issue, see Owen (1999, 92 and 96–98).

3. Five Basic Functions of the Inclusive Imagination

As a scientist of man, Hume aims to explain some mental functions in terms of others. More specifically, he aims to take a complex and initially puzzling mental function, like our ability to carry out sophisticated pieces of probable reasoning, and show how this function is built up out of several simpler and less puzzling functions. Of course, Hume does not go on forever in this process of breaking down complex mental functions into simpler ones. His science of man leaves some mental functions unexplained. These functions are the basic building blocks from which other, more complex, mental functions are built. This section presents five of the main basic functions that Hume attributes to the inclusive imagination—that is, functions that his science of man does not try to explain. The next two sections show how he uses these basic functions to explain several other, more complex mental and social phenomena—some due to reason, others to the exclusive imagination.

Two caveats here: First, saying that a function of the inclusive imagination is basic does not mean that it has no explanation at all, in Hume’s view. He suggests that events taking place in our brains might explain the inclusive imagination’s basic functions (T 1.2.5.20; SBN 60–61). But he does not insist that this is true, and he remains officially agnostic about what, if anything, explains these functions; this question falls outside the scope of his science of man. Second, saying that a mental function is basic does not mean that we have no evidence that it takes place. Hume seems to think that each of us can observe the basic imaginative functions taking place in our minds. So, we have observational evidence that they take place, even though we do not have a scientific explanation of how or why they do. He may also think that his success in using these basic functions to explain other, more complex phenomena gives him further evidence that they take place in our minds (Owen 1999, 77–79).

a. Forming Faint Copies of Simple Impressions

Hume thinks that each of our ideas is either copied from a simple impression (per the Copy Principle), or is built up entirely from simple ideas that are so copied. If our minds could not reproduce our simple impressions, by forming simple ideas copied from them, then we could not form any ideas at all. So, the function of reproducing simple impressions by forming copies of them must be common to the inclusive imagination and the memory—our two faculties for forming ideas. Of course, the copies or simple ideas that we form by means of the inclusive imagination have a lower degree of force, liveliness, or strength than those that we form by means of the memory. Hume does not try to explain how the inclusive imagination forms faint copies of our simple impressions; he simply observes that it does. Hence, this is a basic function of the inclusive imagination. Hume’s main discussions of this function are in Treatise Book 1, Part 1, Section 1; and in the first Enquiry, Section 2.

b. Manipulating the Parts of Ideas

Once we have acquired some ideas by forming copies of our impressions, the inclusive imagination can manipulate their parts in various ways. Hume gives several overlapping lists of these ways: for examples, see the Appendix to the Treatise, paragraph 2; the “Abstract” of the Treatise, paragraph 35; and the first Enquiry, Section 2, paragraph 5. Perhaps the most important two are i) what Hume calls separating or dividing ideas, and ii) what he calls uniting, compounding, or composing them. The inclusive imagination can break any complex idea into parts. For example, it can take an idea of a goat and break it down into an idea of the goat’s head, an idea of its torso, ideas of its legs, and so forth. This is what Hume calls separating or dividing ideas. Once it has broken some ideas down into their parts, it can reassemble these parts at pleasure. For example, it can combine the idea of a goat’s head with the idea of a lion’s body, so as to form the idea of a chimera. This is what Hume calls uniting, compounding, or composing ideas.

These two functions of the inclusive imagination are captured by Hume’s Liberty Principle, which says that the imagination is free to “transpose and change its ideas” by separating and re-uniting their parts (T 1.1.3.4; SBN 10). The Liberty Principle plays an important role in Hume’s philosophy by supporting his Separability Principle, which says that “whatever objects are different are distinguishable, and . . . whatever objects are distinguishable are separable by the thought and imagination” (T 1.1.7.3; SBN 18); by “objects,” here, Hume seems to mean the objects of thought or imagination—that is, the things of which we think or which we imagine. (For a helpful discussion of Hume’s varied use of the word “object,” see Grene 1994.) In turn, the Separability Principle underwrites some of Hume’s central arguments—for example, his argument that the proposition “whatever begins to exist, must have a cause of existence” is “neither intuitively nor demonstratively certain” (T 1.3.3.1–3; SBN 78–80).

A third important way of manipulating the parts of our ideas is what Hume calls augmenting our ideas: in other words, replicating a part of an idea and adding the replica back to the original idea, so as to produce an idea of something larger than what the original idea represented. Hume thinks that this allows us to form an idea of God, using only ideas that are copied from impressions (E 2.6; SBN 19).

Hume does not explain how the inclusive imagination manipulates the parts of its ideas in these ways. Doing so is another of its basic functions.

c. Association

Hume thinks that the inclusive imagination naturally associates some perceptions with others. He usually speaks of the association of ideas, but in some of the most important cases that he discusses, an idea is associated with an impression. For example, he claims that probable reasoning involves “a relation or association in the fancy betwixt the impression and idea” (T 1.3.8.7; SBN 101). Hume’s main discussions of association are in Treatise Book 1, Part 1, Section 4, and in the first Enquiry, Section 3.

According to Hume, there are three principles of association among ideas—in other words, there are three basic laws of the inclusive imagination, describing the ways in which ideas become associated with each other or with impressions. First, ideas tend to become associated if the objects that they represent resemble each other. In Hume’s example, an idea of a picture is associated with an idea of the object(s) that this picture depicts (E 3.3; SBN 24). Second, ideas tend to become associated if the objects that they represent are contiguous to each other, meaning that they are near to each other in space or time. In Hume’s example, the idea of an apartment in a building is associated with ideas of the other apartments in that building (ibid.). Third, ideas tend to become associated if the objects that they represent are causally related. In Hume’s example, the idea of a wound is associated with an idea of the pain caused by that wound (ibid.). Really, then, the phrase “association of ideas” covers three functions of the inclusive imagination.

Hume stresses that these three functions of the inclusive imagination are basic, or left unexplained by his science of man, while also stressing that we have evidence that the inclusive imagination performs these functions:

These are therefore the principles of union or cohesion among our simple ideas . . . Here is a kind of ATTRACTION, which in the mental world will be found to have as extraordinary effects as in the natural, and to show itself in as many and as various forms. Its effects are every where conspicuous; but as to its causes, they are mostly unknown, and must be resolv’d into original qualities of human nature, which I pretend not to explain. Nothing is more requisite for a true philosopher, than to restrain the intemperate desire of searching into causes, and having establish’d any doctrine upon a sufficient number of experiments, rest contented with that, when he sees a farther examination wou’d lead him into obscure and uncertain speculations. (T 1.1.4.6; SBN 12–13. See also T 1.1.7.15; SBN 24)

Hume uses these three basic functions of the inclusive imagination to explain numerous other, more complex mental functions—some due to reason, others to the exclusive imagination. Sections (4) and (5), below, discuss some important examples. Hume is especially proud of this aspect of his science of man. He writes that his “use . . . of the principle of the association of ideas” is what, “if any thing,” can entitle him to “so glorious a name as that of an inventor” (T Abs 35; SBN 661).

In his discussions of the passions, Hume expands his account of association in several ways. Most importantly, he adds that the association of two ideas is strengthened if they are accompanied by impressions that resemble each other. This is because resembling impressions are themselves associated, and the two associative relations—that of the ideas, and that of the accompanying impressions—combine to give the mind a “double impulse” to move, associatively, from the first idea-impression pair to the second (T 2.1.4.4; SBN 284). Hume calls this phenomenon the “double relation of ideas and impressions” (T 2.1.5.5; SBN 286–7). He uses it to explain the passions of pride, humility, love, and hatred.

Hume also adds that, in its associative transitions, the inclusive imagination gravitates towards objects that are more important, or closer to oneself in space and time. An idea that represents a relatively unimportant object—for example, a servant—tends to produce ideas of relatively important objects that are associated with it—for example, the servant’s master; in contrast, an idea of a master does not tend to produce an idea of his servant. Similarly, an idea that represents a relatively distant object—for example, one of Jupiter’s moons—tends to produce ideas of relatively nearby objects that are associated with it—for example, an idea of the Earth’s moon; in contrast, an idea of the Earth’s moon does not tend to produce an idea of one of Jupiter’s moons. When two ideas are associated in such a way that each of them is equally likely to be accompanied or followed by the other, Hume says that there is a “perfect relation” between the objects that they represent (T 2.2.4.10; SBN 355).

d. Transmitting Force and Liveliness among Associated Perceptions

Hume thinks that impressions have more force and liveliness (or vivacity) than ideas in the memory, which in turn have more force and liveliness than ideas in the imagination. But he does not think that the latter all have the same low degree of force and liveliness; some of them have a higher degree than others. This allows him to explain the difference between an idea of a contingent state of affairs that we believe to obtain and an idea of one that we merely entertain or think about. Consider two people: someone who believes that there will be a third world war, and someone who entertains the thought that there will be one, but does not believe it. Hume will say that each of these people forms an idea in her imagination. (If the idea is a product of probable reasoning, it belongs to reason and hence to the inclusive imagination; if it is a whimsy or flight of fancy, it belongs to the exclusive imagination.) He will also say that each of their ideas represents the same thing—namely, a third world war’s taking place in the future. But there is a difference between the two ideas, which Hume must explain: one of them is a belief; the other is not. His explanation is that the former idea has more force and liveliness than the other.

Hume defines a belief as “a lively idea related to or associated with a present impression” (T 1.3.7.5; SBN 96). This is because he thinks that an idea must inherit its force and liveliness, directly or indirectly, from an impression. The impression gives force and liveliness to the idea, and thereby turns that idea into a belief; this belief, in turn, can give force and liveliness to other ideas.

In order for one perception to give force and liveliness to another, these perceptions must be associated: associative relations are akin to “pipes or canals” through which force and liveliness can flow (T 1.3.10.7; SBN 122). Hume thinks that associative links due to causation transmit a higher degree of force and liveliness than those due to resemblance or contiguity (T 1.3.9.8; SBN 110).

Transmitting force and liveliness among associated perceptions—especially, among those associated due to causation—is a fourth basic function of the inclusive imagination. In the Treatise, Hume uses it, together with the principles of association of ideas, to explain several important mental phenomena, including probable reasoning and sympathy. Sections (4b) and (4c), below, discuss these phenomena.

Hume’s main discussions of the transmission of force and liveliness are in Treatise Book 1, Part 3, Sections 7–9. Shortly after writing these sections, Hume seems to have changed his view about the nature of belief. In an Appendix published in the following year, together with Treatise Book 3, he wrote that two ideas of the same object can differ in ways other than their degree of force and vivacity (T App 22; SBN 636), and that “reflection on general rules keeps us from augmenting our belief upon every encrease of the force and vivacity of our ideas” (T 1.3.10.12App; SBN 632). This suggests that he no longer identified belief with a higher-than-usual degree of force and vivacity. Later, in the first Enquiry, he refrained from explicitly likening beliefs to impressions, in respect of their force and vivacity (E 5.11–13; SBN 48–50). How significantly did he change his views? Commentators disagree: for two different perspectives, see Owen (1999, 172–4) and Wilbanks (1968, 29–30). Whatever the answer may be, Hume clearly continued to hold that an idea is “enlivened” or receives additional “force and vigour” (E 5.15; SBN 51) when it is associatively related to an impression.

e. Completing the Union of Related Objects

Hume claims that, when our ideas of two objects are associated by certain relations, we tend to imagine further relations among them, in order to “compleat the union” (T 1.4.2.55; SBN 217); his main discussions of this function are in Treatise Book 1, Part 4, Sections 2 and 5.

Hume claims that this basic function of the inclusive imagination explains why those who believe in external objects that cause their impressions tend to believe that these objects also resemble their impressions: they add the relation of resemblance to that of causation in order to complete the union between the external object and the impression (T 1.4.2.55; SBN 217). Similarly, this function explains why we believe that sounds, tastes, and smells have spatial locations. In Hume’s view, these sensible qualities “exist nowhere” (T 1.4.5.10; SBN 235–6)—they do not have spatial locations. But we typically experience the taste and smell of an olive, say, at the same time as experiencing the olive itself; and we take the olive to cause its taste and smell. Because of our tendency to complete the union of related objects, we imaginatively add the relation of spatial contiguity to those of temporal contiguity and causation. In other words, we imagine that the olive’s taste and smell are located where the olive is.

Most importantly, Hume uses this basic imaginative function to explain certain forms of projection—our mind’s tendency to “spread itself on external objects” (T 1.3.14.25; SBN 167). Projection plays an important role in his theories of causal necessity and moral value. Section (4d), below, discusses it.

4. Four Non-Basic Functions of the Inclusive Imagination

Much of Hume’s philosophical work aims to explain how the inclusive imagination’s basic functions work together with each other and with other features of our minds, such as our passions, to produce complex mental and social phenomena. This section focuses on four important examples: abstract ideas, probable reasoning, sympathy, and projection. The next section focuses on an important class of examples that fall under the heading of “fiction.”

a. Abstract Ideas

Hume says that every idea is individual or particular. By this, he means both that the idea itself is a particular (not a universal) and that it represents a particular object: when we form an idea, “the image in the mind is only that of a particular object” (T 1.1.7.6; SBN 20). However, we are not restricted to thinking of one particular thing at a time. We can grasp thoughts like all dogs are mammals and all triangles are shapes. If an idea represents just one particular object, then how can we do this—how can we think of all the particular dogs that exist, or all the particular triangles? Hume’s answer is that a “particular idea” comes to serve as an “abstract” or “general” representation; in other words, it comes to represent all the particular things of some sort. He explains how this happens by appealing to the association of ideas.

Hume proposes that an idea serves as a general representation if it is called to mind by a word—a common noun like “dog” or “triangle”—which is associated with many ideas of resembling objects. Suppose that, on hearing the word “dog,” you happen to form an idea of a particular dog, Fido. If it occurred on its own, this idea would represent just this one particular dog. But when it occurs in partnership with a word that is also associated with many other ideas of particular dogs (Spot, Rover, and so forth), the idea of Fido serves as a proxy for those other ideas (T 1.1.7.7–10; SBN 20–22). Hence, it serves as a representation of all dogs.

This explanation involves two of Hume’s principles of association. First, it involves contiguity. We have often uttered the word “dog,” or (more probably, if we are learning a language) have often heard this word uttered, in the presence of Fido. This contiguity in space and time between the word “dog” and Fido leads us to associate that word with him. Second, it involves resemblance. Because Fido, Spot, Rover, and other dogs resemble each other in many important ways, we come to associate the same term with each of them. Also thanks to this resemblance, an idea of one of these dogs tends to be followed by one or more ideas of the other dogs. Hume thinks that this helps one idea to serve as a proxy for the others.

Hume thinks that the main reason why other philosophers have posited a faculty of pure intellect, distinct from the inclusive imagination, is to “explain our abstract ideas, and to show how we can form an idea of a triangle, for instance, which shall neither be an isosceles nor a scalenum, nor be confin’d to any particular length or proportion of sides” (T 1.3.1.7; SBN 72). He presumably thinks that his own account of abstract ideas undermines this reason: it shows that the inclusive imagination can explain our abstract ideas; so, there is no need to posit an additional faculty of pure intellect.

Hume’s main discussion of this non-basic function of the inclusive imagination is in Treatise Book 1, Part 1, Section 7.

b. Probable Reasoning

By “probable reasoning,” “moral reasoning,” or “reasoning concerning matter of fact,” Hume means reasoning to beliefs about matters of fact that we have not observed. For example, we all believe that the sun will rise tomorrow. But this belief is not due to observation: we cannot have observed the sun’s rising tomorrow, because it has not happened yet. So, our belief that the sun will rise tomorrow must be due to probable reasoning: we must have reasoned our way to this belief, based on other things that we have observed. Hume distinguishes two main kinds of probable reasoning, which he calls proofs and probabilities (T 1.3.11.2; SBN 124). A proof is a piece of probable reasoning whose conclusion is “entirely free from doubt and uncertainty” (T 1.3.11.2; SBN 124). For example, we have no doubt that the sun will rise tomorrow. So, the piece of probable reasoning that leads us to this conclusion is a proof. A probability is a piece of probable reasoning whose conclusion is “still attended with uncertainty” (T 1.3.11.2; SBN 124). For example, when I have a headache, I believe with some confidence that taking acetaminophen will cure it. But I do not believe this with complete certainty: taking acetaminophen usually cures my headaches, but not always. So, the piece of probable reasoning that leads me to conclude that taking acetaminophen will cure my current headache is a probability. (To avoid confusion, it is important to recognize that Hume uses the terms “probable reasoning” and “probability” in two senses: i) an inclusive sense, in which these terms denote the genus of reasoning whose species are proofs and probabilities; and ii) an exclusive sense, in which they denote just one species of this genus—probabilities as opposed to proofs. For examples of the inclusive sense, see T 1.3.6.4, 1.3.6.6–7 and 1.3.9.19n22; SBN 89, 89–90 and 117–18n; for examples of the exclusive sense, see T 1.3.11.2; SBN 124. In this section, I will use “probable reasoning” only in the inclusive sense, and “probability” only in the exclusive sense.)

Hume observes that our ordinary actions and our scientific inquiries—including those that he himself conducts, as a scientist of man—depend on probable reasoning and the beliefs that it produces. Therefore, it is especially important to him to explain how our minds carry out this kind of reasoning. He argues that probable reasoning is a non-basic function of the inclusive imagination, built up from two basic ones: association, and the transmission of force and vivacity among associated perceptions.

To see this, let us consider Hume’s favorite example of an elementary proof: we see one billiard ball hurtling across the table towards a second ball, which is unobstructed; and we form the belief—without any doubt or uncertainty—that the two balls will collide, and that the second ball will start to move. In order to explain this piece of reasoning, Hume breaks it down into three parts (T 1.3.5.1; SBN 84). The first part is our “original impression”—in this case, a sensory impression of the two billiard balls. In general, Hume avoids the question of how our sensory impressions are produced, so he leaves this part of our reasoning unexplained.

The second part of our probable reasoning is a mental “transition” from our original impression to an idea that represents the two balls colliding, and the second ball starting to move. Hume famously argues that this transition is due to imaginative association. In the past, whenever we have observed billiard balls in similar situations—one ball hurtling towards another, unobstructed, ball—we have observed the balls to collide, and the second start to move. This course of past experience has established an associative relation: a perception of billiard balls in this situation now calls to our mind an idea of the balls colliding, and the second starting to move. It is due to this associative relation, Hume claims, that the sight of billiard balls in this situation now causes us to form such an idea (T 1.3.6.12–16; SBN 92–94). This is an example of association by causation—one of the three principles of association that Hume identifies; see section (3c), above. Hume thinks that only causation can inform us about unobserved matters of fact: that is, we can only learn about an unobserved matter of fact if it is causally related to some other matter, or matters, of fact that we have observed (T 1.3.2.2–3; SBN 73–74, E 4.4–5; SBN 26–27). So, he thinks that all probable reasoning involves association by causation.

The third part of our probable reasoning is the transmission of force and liveliness to our idea, so that we believe—not just entertain the thought—that the billiard balls will collide and that the second one will start to move. Once he has established that imaginative association explains our transition from our impression of the billiard balls to this idea, this third part of our reasoning is easy for Hume to explain. Impressions have a high degree of force and liveliness, and transmitting force and liveliness among associated perceptions is a basic function of the inclusive imagination. So, we should expect the transmission of force and liveliness from our impression to our idea of the two billiard balls colliding, and the second one’s starting to move. As a result, our idea becomes a belief.

Hume writes that probabilities are “deriv’d from the same origin”—that is, from the same basic functions of the inclusive imagination—as proofs (T 1.3.11.1; SBN 124). In the Treatise, he distinguishes three kinds of probability: the probability of chances; the probability of causes; and probability arising from analogy (T 1.3.11.3, 1.3.12.25; SBN 124–5, 142). We rely on the probability of chances and the probability of causes when we do not have a large, uniform body of past experience concerning the matters of fact about which we are reasoning. For example, when I roll a fair, six-sided die, I do not have a uniform body of past experience concerning which face will land uppermost: in my past experience, rolling the die has sometimes been followed by one face landing uppermost, sometimes by another face landing uppermost. But if the die has four faces marked with squares, and only two marked with circles, I come to believe with some confidence that one of the faces marked with a square will land uppermost; this belief derives from the probability of chances. Similarly, when I take acetaminophen in the hopes of curing my headache, I do not have a uniform body of past experience concerning the curing of my headache: in my past experience, taking acetaminophen has usually been followed by the curing of a headache—but not always. Again, I come to believe with some confidence that taking acetaminophen on this occasion will be followed by the curing of my headache; this belief derives from the probability of causes. Hume argues that, like proofs, both the probability of chances and that of causes are explained by the association of ideas and the transmission of force and vivacity between associated perceptions.

We rely on probability arising from analogy when we observe a matter of fact that bears some resemblance, but not a perfect resemblance, to matters of fact that we have previously observed. Suppose that I have a large body of past experience of Labradors in which, whenever a Labrador has approached me with its tail wagging, it has then greeted me effusively; suppose, also, that I have no past experience of German Shepherds, but that I now see one approaching me with its tail wagging. Because this German Shepherd does not perfectly resemble anything that I have previously experienced, I do not have a proof that it will greet me effusively. But, because it bears some resemblance to the Labradors that I have experienced, I believe with some confidence that it will greet me effusively. According to Hume, this belief is due to probability arising from analogy—in this case, the analogy between the German Shepherd that I now experience and the Labradors that I have previously experienced. Hume holds that this species of probability is explained by the same basic functions of the inclusive imagination as proofs, the probability of chances, and the probability of causes (T 1.3.12.25; SBN 142). In the first Enquiry, Hume does not class analogy as a third species of probability; instead, he writes that all probable reasoning—including proofs, as well as probabilities—is “founded on a species of Analogy” (E 9.1; SBN 104).

When we carry out simple pieces of probable reasoning, we do so reflexively. For example, when we see one billiard ball hurtling towards another, we immediately form the belief that the balls will collide, and that the second will start to move; we need not reflect on our past experiences, or construct an argument, in order to do so. Not all probable reasoning is like this. More sophisticated pieces of probable reasoning are reflective, not reflexive: they involve reflection on past experience, and the construction of arguments. (For the “reflexive/reflective” distinction, see Owen 1999, 149–50.) But Hume explains this reflective kind of probable reasoning in terms of the reflexive kind. In order to carry out reflective probable reasoning, we need to establish general principles to serve as premises in our arguments, such as “the principle, that like objects, plac’d in like circumstances, will always produce like effects” (T 1.3.8.14; SBN 105; see also T 1.3.12.7–12; SBN 133–5). And we cannot begin to establish such principles, except by means of reflexive probable reasoning. For example, we reflexively believe that like objects in like circumstances will produce like effects because “we have many millions [of experiments] to convince us of this principle,” and so “this principle has establish’d itself by a sufficient custom” (ibid.).

So, Hume explains sophisticated, reflective probable reasoning by showing how it is built up from unsophisticated, reflexive probable reasoning; and, as we have seen, he explains unsophisticated, reflexive probable reasoning in terms of two basic functions of the inclusive imagination: association and the transmission of force and liveliness. In Hume’s view, then, we can conduct sophisticated research in the empirical sciences only thanks to the inclusive imagination and its basic functions.

Hume’s main discussions of proofs are Treatise Book 1, Part 3, Sections 4–7; the “Abstract” of the Treatise, paragraphs 8–23; and the first Enquiry, Sections 4 and 5. His main discussions of probabilities are Treatise Book 1, Part 3, Sections 11–13; and the first Enquiry, Section 6.

c. Sympathy

In Hume’s view, to “sympathize” is to share the feelings of a person whom one encounters. At the sight of a cheerful face, one tends to feel more cheerful oneself. Similarly, at the sight of an angry or sorrowful face, one’s own mood is dampened. In each case, a sentiment or feeling of the person observed is communicated, by sympathy, to the observer.

Hume’s account of sympathy resembles that of probable reasoning in two ways. First, he explains sympathy in terms of the same two basic imaginative functions: association and the transmission of force and vivacity among associated perceptions. Second, as with probable reasoning, Hume distinguishes reflexive and reflective forms of sympathy.

Consider an example of the reflexive form of sympathy: you meet a joyful person, and consequently feel the passion of joy yourself. Hume distinguishes two components within this process. First, a piece of probable reasoning: you observe the effects of joy in the other person’s voice and gestures; from your observation of these effects, you infer the presence of joy in her mind (T 2.1.11.3, 3.3.1.7; SBN 317, 575–6). This first component explains why you should come to believe in the presence of joy in the other person’s mind. But it does not yet explain why you should come to feel the passion of joy yourself. This explanation comes from the second component that Hume discerns in the process of sympathizing. He claims that you always have a very forceful and vivacious perception of yourself (T 2.1.11.4; SBN 317). Since you are both human beings, the joyful person whom you have met resembles you closely, and—in the case we are now considering—she and the joy that she feels are contiguous to you in space and time. Thanks to these relations of resemblance and contiguity, your very forceful and lively perception of yourself is associated with your idea of this other person and the joy that she feels. Thanks to the inclusive imagination’s basic function of transmitting force and vivacity among associated perceptions, your idea of the other person’s joy receives an extra dose of force and vivacity from your perception of yourself. Your idea of the other person’s joy therefore becomes an impression of joy—that is, it becomes an actual instance of the passion of joy (T 2.1.11.5–7; SBN 319). In Hume’s view, this is the process by which we come to sympathetically share the passions or feelings of other people. As we can see, it involves the same two basic imaginative functions twice over: association and the transmission of force and vivacity among associated perceptions explain both the initial piece of probable reasoning that produces an idea of the other person’s passion, and the extra dose of force and vivacity that this idea receives, which turns it into a passion.

Hume argues that our moral sentiments—the approval that we feel when considering someone’s virtues, and the disapproval when considering her vices—derive from sympathy (T 3.3.1.6–26; SBN 575–89). When we consider somebody with a character-trait that is useful to those around her—generosity, for example—we sympathetically share the pleasurable passions of joy and gratitude that this character-trait induces in the people who benefit from it. Because we share these pleasurable passions, we morally approve of the character-trait that causes them.

If all our sympathetic responses were reflexive, however, then our sentiments of moral approval and disapproval would fluctuate wildly. As people become more distant from us in space and time, our ideas of them and their passions become less strongly associated with our forceful and vivacious perceptions of ourselves; we therefore sympathize less strongly with them. So, if all of our moral sentiments derived from reflexive sympathy, we would not approve as much of past virtues as we do of present ones, and we would not approve as much of the virtues of spatially distant people as we do of the virtues of people living close to us. However, our moral sentiments do not in fact fluctuate in these ways: “we give the same approbation to the same moral qualities in China as in England” (T 3.3.1.14; SBN 581). Hume explains that this is because our moral sentiments derive from a more sophisticated form of sympathy, in which we “correct” our sentiments by a kind of “reflection” (T 3.3.1.17; SBN 583). When we sympathize in this reflective way, we consider only the ways in which a person’s character tends to affect the people with whom she interacts—“those, who have any commerce with the person we consider” (T 3.3.1.18; SBN 583). We base our moral sentiments not on how reflexive sympathy makes us feel, but on how reflective sympathy tells us that we would feel, if we were to encounter the person whose character we are evaluating, and the people whom she directly affects.

Hume holds that the reflective kind of sympathy from which our moral sentiments derive is a corrected form of reflexive sympathy; and, as we have seen, he explains reflexive sympathy in terms of two basic functions of the inclusive imagination—association and the transmission of force and liveliness. In Hume’s view at the time of writing the Treatise, then, we owe our moral sentiments, like our capacities for abstract thought and probable reasoning, to the imagination and its basic functions.

Hume’s main discussions of sympathy are in Treatise Book 2, Part 1, Section 11; and Book 3, Part 3, Section 1. In his Enquiry Concerning the Principles of Morals (hereafter, second Enquiry), Hume does not discuss sympathy as extensively, or in as much detail, as he does in the earlier Treatise. This leads some commentators to think that he changed his views about the origins of our moral sentiments, in between writing these works. Abramson (2001) argues convincingly that this is not the case, and that the imaginative mechanism of reflective sympathy plays much the same role in the second Enquiry as it does in the Treatise.

d. Projection

In Hume’s view, when we think that one thing causes another, we take there to be a “necessary connexion” between the cause and its effect. Given that the cause happens, we take it that the effect must follow. For example, given that a speeding billiard ball collides with an unobstructed, stationary ball, the latter must start moving; or, given that a burning match is applied to dry kindling in an oxygen-rich environment, the kindling must start burning. Hume investigates at length how we acquire the idea of this necessary connection between cause and effect, and what this idea really represents. He argues that this idea does not represent anything that belongs to, or exists between, the cause and effect themselves. Instead, it represents a feature of our minds: “an internal impression of the mind, or a determination to carry our thoughts from one object to another” (T 1.3.14.20; SBN 165). The “determination” of which Hume writes here is the transition involved in reflexive probable reasoning. By calling this transition an impression, Hume suggests that it has a distinctive feeling—when we see one billiard ball strike another, we feel ourselves determined to believe that the second ball will start moving. When we think or speak of two events as if they were necessarily connected—for example, when we say that a billiard ball must start moving, given that another ball has struck it—we are “spreading” this feeling of determination, which exists in our own mind, onto the events themselves:

’Tis a common observation, that the mind has a great propensity to spread itself on external objects, and to conjoin with them any internal impressions, which they occasion, and which always make their appearance at the same time that these objects discover themselves to the senses. Thus . . . we suppose necessity and power to lie in the objects we consider, not in our mind, that considers them; notwithstanding it is not possible for us to form the most distant idea of that quality, when it is not taken for the determination of the mind, to pass from the idea of an object to that of its usual attendant. (T 1.3.14.25; SBN 167)

Scholars often express this claim in terms of projection: in Hume’s view, they say, we project our psychological determination to expect one event, given that another has taken place, onto the causally related events themselves. Hence, some scholars say that Hume holds a projectivist view of causal necessity (for example, see Beebee 2006).

Hume indicates that two basic functions of the inclusive imagination explain why we project our impression, or determination, onto the causally related events themselves. The first is association. Our impression or determination occurs at around the same time as the causally related events (in Hume’s language, it is contiguous to them in time), and it is caused by the first of these events—we are determined to expect motion in the second billiard ball because we see the first ball hurtling towards it. Because of this contiguity and this causal relation, the causally related events come to be associated with our impression or determination. The second basic function involved in projection is our propensity to complete the union of related objects: because the causally related events are temporally contiguous with, and cause, our impression or determination, our imagination tends to “feign” a relation of spatial contiguity between them as well (T 1.4.5.12; SBN 237–8). In other words, we complete the union between the causally related objects, on the one hand, and our internal impression or determination, on the other, by imagining that the internal impression occurs outside our mind, in the very place where the causally related events are located. That is to say, we project that internal impression onto those events.

Hume makes similar-sounding appeals to projection elsewhere in his philosophical works. For example, he writes that “taste”—the faculty which gives us our sentiments of aesthetic beauty and deformity, and of moral vice and virtue—“has a productive faculty, and gilding or staining all natural objects with the colours, borrowed from internal sentiment, raises in a manner a new creation” (second Enquiry, Appendix 1, paragraph 21). This leads some commentators to say that our aesthetic and moral evaluations involve projection, in Hume’s view: when we think of something as beautiful, or of someone as morally vicious, we are projecting our internal sentiments onto them. Hume does not explain how these aesthetic and moral kinds of projection occur. But, as he aims to explain human mental phenomena systematically, by appeal to a small number of basic principles, he is likely to explain them by means of the basic imaginative functions that he uses to explain why we project our internal impression of causal necessity.

Even among those scholars who agree that Hume gives projectivist theories of causation, morality, and aesthetics, there are disagreements about exactly what he understands projection to be, and what his projectivism implies. For example, some scholars think that projection is a kind of error that we make, while others think that projection need not involve any kind of error. For a second example, some scholars think that Hume’s projectivist theories of causal necessity and moral value are, in some sense, anti-realist—in other words, these theories imply that causal necessity and moral value are, in some sense, not real features of the world—while others think that his projectivism is consistent with a realist view of the projected features. For a helpful discussion of projection in general, and of Hume’s use of projection in particular, see Kail (2007).

The main texts that have inspired projectivist interpretations of Hume are Treatise Book 1, Part 3, Section 14, especially paragraphs 20–29; and Appendix 1 of the second Enquiry.

5. Fictions of the Imagination

a. Fictions in Hume’s Science of Man

Hume’s science of man aims to explain the most general beliefs and ways of thinking that we adopt in the course of ordinary life and in philosophical reflection. Often, Hume concludes that these beliefs and ways of thinking are not products of demonstrative or probable reasoning but, instead, are fictions produced by the exclusive imagination. According to him, the fictions that we form in ordinary (or “vulgar”) life include our belief in aggregates that have “unity” or oneness, such as one crowd, an aggregate of people; our belief that certain objects persist through time without changing; and our belief, of sensible objects like coffee cups, tables, and chairs, that they continue to exist at times when nobody perceives them. Hume thinks that, in the course of philosophical reflection, we tend to form further fictions. For example, when we reflect philosophically on our sensory experiences, we come to believe that the only objects truly “present to” our minds are impressions and ideas, but that some of our impressions are caused by and represent external, material objects; Hume regards belief in these external, represented objects as a new fiction. As well as calling these beliefs fictions, Hume calls the distinctive imaginative process or operation that produces them fiction (for example, see T 1.2.3.11 and 1.4.2.29; SBN 37 and 200–201). This double use of the term “fiction” is in keeping with ordinary eighteenth-century English usage. Johnson’s 1755–6 Dictionary of the English Language shows that, in eighteenth-century English, the term “fiction” could mean both “the thing feigned or invented” (hence, Hume applies the term to certain ideas and beliefs) and “the act of feigning or inventing” (hence, Hume applies the term to the imaginative processes responsible for those ideas and beliefs).

Evidently, Hume thinks that many of our beliefs are fictions of the imagination. Fictions are so important within his science of man, one commentator suggests, that “what is commonly called Hume’s theory of impressions and ideas ought to be called the theory of impressions, ideas, and fictions” (Traiger 1987, 381). But it is hard to interpret Hume’s views about fictions. He suggests that fictions involve “apply[ing]” an idea to an object or impression from which we cannot derive that idea, and that this is an “improper” and “inexact” way of using that idea (T 1.2.3.11; SBN 37). But it is not clear what this means: in what sense do fictions involve an improper and inexact use of our ideas? Different commentators answer this question in different ways. According to some, Hume sees all fictions as falsehoods. According to others, he allows that some fictions may be true, but thinks that we lack evidence or justification for believing them. According to others still, he sees fictions as incoherent or unintelligible; if this is correct, then fictions may not be genuine ideas or beliefs, but pseudo-ideas or -beliefs, in Hume’s view. Of course, it is possible to combine these interpretations by distinguishing different kinds of fictions: for example, we may interpret Hume as thinking that some fictions are falsehoods, while others are unintelligible.

The rest of this section briefly examines three of the most important fictions that Hume discusses. It aims to exhibit the features of his discussions that motivate each type of interpretation that we have just surveyed. Hume’s discussions of these fictions are in Treatise Book 1, Part 4, Sections 2 and 3.

b. The “Vulgar” Fiction of a Continued Existence

In the Treatise section “Of scepticism with regard to the senses,” Hume tries to explain how we come to believe in bodies, or material objects, that continue to exist at times when nobody perceives them. He thinks that this belief can take two different forms: a “vulgar” or ordinary form, and a “philosophical” form. Hume thinks that the only things “present to the mind” are perceptions, or impressions and ideas (T 1.2.6.8, T Abs 5; SBN 67, 647). But ordinarily, he thinks, we do not realize this. Instead, we take certain of our sense-impressions to be bodies—that is, we ordinarily believe, of certain sense-impressions, that they continue to exist at times when they are not present to our minds (T 1.4.2.31, 1.4.2.36; SBN 202, 205). Hume calls this “vulgar” belief “the fiction of a continu’d existence” (T 1.4.2.36; SBN 205).

According to Hume, the main reason why we entertain this “vulgar” fiction is the “constancy” of certain sense-impressions before and after an interruption in our experiences (T 1.4.2.23; SBN 198–9). Suppose that I shut my eyes for a moment and that, upon re-opening them, I receive sense-impressions of the furniture in the room that closely resemble those that I received before shutting my eyes. Because of this resemblance or “constancy,” when I recall the earlier impressions, I naturally recall the later impressions, too: my mind “readily passes from one to the other,” due to the association of ideas of resembling objects. Thanks to a complicated imaginative mechanism, which Hume describes over several pages, this association of ideas leads me to imaginatively fill the gap in the sequence of sense-impressions that I received: I imaginatively construct ideas of furniture existing during the time when my eyes were shut, connecting up my memories of the last furniture-impression that I received before shutting my eyes and the first one that I received after re-opening them (T 1.4.2.31–40; SBN 201–8). Because these imaginatively constructed ideas are associated with memories, a high degree of force and vivacity is transmitted to them (T 1.4.2.41–42; SBN 208–9). Thanks to this mechanism, which involves both the association of ideas and the transmission of force and vivacity among related perceptions, I ordinarily come to believe, of my furniture-impressions, that they continued to exist while my eyes were shut.

However, Hume argues that none of our sense-impressions continue to exist at times when they are not present to our minds (T 1.4.2.44–45; SBN 210–11). When I shut my eyes, the furniture-impressions that were present to my mind cease to exist; when I re-open my eyes, new furniture-impressions are created in my mind, which are similar, but not numerically identical, to the earlier ones. So, the “vulgar” fiction of a continued existence is false, according to Hume. This is consistent with an interpretation on which Hume thinks that all fictions are falsehoods; however, it is also consistent with one on which Hume thinks that only some fictions are falsehoods, while others are unjustified beliefs or unintelligible pseudo-beliefs.

c. The Philosophical Fiction of Double Existence

Numerous Early Modern philosophers shared Hume’s view that only perceptions are ever “present to the mind,” but also held that we perceive bodies that continue to exist at times when nobody perceives them. These philosophers thought that we can perceive bodies by means of certain sense-impressions, because these impressions are caused by bodies, and represent those bodies to us. Hume calls this philosophical theory of how bodies are perceived the “opinion of a double existence of perceptions and objects” (T 1.4.2.46; SBN 211), because the theory posits two kinds of existent things: “perceptions,” or impressions that represent bodies to us; and “objects,” the bodies that are represented by (and in that sense are the “objects” of) our impressions.

Hume argues that this theory of double existence is “a new fiction,” due to the exclusive imagination, like the “vulgar” fiction that it replaces (T 1.4.2.52; SBN 215). However, he does not say that this new, philosophical fiction of double existence is false. Instead, he emphasizes that we cannot reason our way to believing it. We believe it only due to the exclusive imagination. This suggests that Hume regards the fiction of double existence as a belief that is unjustified, or inadequately supported by our evidence for it, but that may nonetheless be true.

d. The Philosophical Fiction of an Underlying Substance

Hume thinks that an ordinary sensible object, like a peach or melon, is just an aggregate of sensible qualities: for example, a ripe peach is an aggregate of a yellow-orange color, a fuzzy texture, solidity, and a sweet smell and taste (T 1.4.3.2, 1.4.3.5; SBN 219, 221). However, he thinks that we are prone to suppose otherwise. Instead of taking a peach to be an aggregate of many sensible qualities, we take it to be one thing. This leads to a kind of philosophical puzzlement: how can many things (the many aggregated qualities) also be one thing—isn’t this an “evident contradiction” (T 1.4.3.2; SBN 219)? According to Hume, many philosophers have responded to this puzzle by supposing that a peach is not the same thing as its sensible qualities, but is instead “an unknown something”—a substance or substratum that underlies its sensible qualities, and in which those qualities exist. The presence of this “unknown something,” underlying the sensible qualities, is what gives the peach “a title to be call’d one thing” (T 1.4.3.5; SBN 221).

Hume thinks that this underlying substance or “unknown something” is a fiction, characteristic of ancient philosophy (T 1.4.3.1, 1.4.3.5; SBN 219, 221). It is “feign[ed],” or postulated, by the exclusive imagination. Hume also calls this fiction an “unintelligible chimera” (T 1.4.3.7; SBN 222). Elsewhere, he explains the sense in which it is unintelligible. All of our ideas are copied from our impressions, or are made up of ideas that are so copied. But an underlying substance is supposed to be an entirely different kind of thing from an impression. So, we cannot form an idea of an underlying substance (T 1.4.5.3; SBN 232–3).

Does this mean that we cannot think about underlying substances at all? When Hume introduces the concept of an idea, he equates having ideas with thinking. This suggests that the answer is yes—the fact that we cannot form an idea of an underlying substance does mean that we cannot think about such substances at all.

However, other things that Hume says cast doubt on this interpretation. He seems to posit several different fictions that cannot be made up of ideas copied from impressions. For example, the “unintelligible” fiction of an underlying substance differs from the “incomprehensible” fiction of a perfect standard of equality (T 1.2.4.24; SBN 47–49). But how can entertaining one of these fictions differ from entertaining the other if, in each case, we have no thought at all about the thing that we are feigning, or fictitiously representing? Some commentators solve this puzzle by pointing to passages where Hume seems to distinguish two kinds of imaginative thought: conceiving and supposing (T 1.2.6.8–9, 1.4.2.56; SBN 67–68, 218). Hume seems to equate conceiving with forming ideas (T 1.2.2.8; SBN 32). This leaves open the possibility that supposing is a kind of imaginative thought that does not involve forming ideas. If this is Hume’s view, then he can allow that we can think about underlying substances and perfect standards of equality by making suppositions about them, even though we cannot conceive them or form ideas that represent them. For an interpretation of this kind, see Wilbanks (1968). For a helpful discussion of the problem posed by “unintelligible” fictions, and a creative solution, see Loeb (2002: chapter 5, esp. 162–72).

6. Imaginability and Possibility

Hume holds that whatever can be clearly (and, he sometimes adds, distinctly) conceived is possible. Scholars often call this his Conceivability Principle. Superficially, it resembles a principle that Descartes accepts: “everything which I clearly and distinctly understand is capable of being created by God so as to correspond exactly with my understanding of it” (Descartes, Sixth Meditation; CSM 2:54). But there are important differences between them. In Descartes’s view, our clearest and most distinct conceptions are due to our “pure understanding” or “pure intellect”—a faculty by which we form completely non-sensory, non-imagistic ideas—whereas “our sensory grasp” of things “is in many cases . . . obscure and confused” (Sixth Meditation; CSM 2:55); see section (2e) above. Therefore, in his Meditations, Descartes aims to help his readers achieve clear and distinct conceptions of the soul and the body by leading their minds away from the senses and imagination (as he explains in the Synopsis to the Meditations). In contrast, Hume writes that our impressions—the perceptions that our internal and external senses present to our minds—are “clear and evident” (T 1.2.3.1; SBN 33). And he equates clear and distinct conceivability with imaginability, as this passage makes clear:

’Tis an establish’d maxim in metaphysics, that whatever the mind clearly conceives includes the idea of possible existence, or in other words, that nothing we imagine is absolutely impossible. (T 1.2.2.8; SBN 32, italics in original)

Unlike Descartes’s principle, then, Hume’s Conceivability Principle means that whatever can be clearly (and distinctly) imagined is possible. (Hume does not specify whether he has the inclusive or exclusive imagination in mind. Likely, he thinks that any clear idea formed in the inclusive imagination—be it by reason, or by the exclusive imagination—represents something that is possible.)

Hume uses his Conceivability Principle as a premise in several of his most important arguments. For example, he uses it to argue that the proposition “whatever has a beginning has also a cause of existence” is “neither intuitively nor demonstrably certain” (T 1.3.3.2–3; SBN 79–80). Only necessary truths—truths that could not have been false—are intuitively or demonstrably certain. But, Hume claims, we can clearly imagine something starting to exist without a cause. Together with his Conceivability Principle, this implies that it is possible that something should start to exist without a cause. It follows that the proposition “whatever has a beginning has also a cause of existence” is not a necessary truth (it could have been false); hence, it is neither intuitively nor demonstrably certain. Hume argues in a similar way, using his Conceivability Principle, that no demonstrative argument can prove that nature is uniform (T 1.3.6.5; SBN 89), and that we cannot conceive of a real “necessary connexion” between a cause and its effect (T 1.3.14.13; SBN 161–2).

Hume’s claims about imaginability and possibility raise two main interpretive questions. First, whose ability clearly to imagine something guarantees that it is possible: that of an ordinary human being, like you or me, or that of an ideal human being, whose mind is in perfect working order and who has a large stock of simple ideas to use in forming her clear and distinct conceptions? Second, in addition to accepting that whatever can be clearly imagined is possible, does Hume also accept that whatever cannot be clearly imagined is impossible? This “Inconceivability Principle” seems indefensible to many philosophers, but one paragraph in the Treatise suggests that Hume nonetheless accepts it (T 1.2.2.8; SBN 32).

In his Essays on the Intellectual Powers of Man, Hume’s contemporary Thomas Reid presents several objections to the principle that whatever is conceivable is possible (Essay 4, Chapter 3). He regards Hume as one of the targets of these objections. However, Reid agrees with Hume that we cannot distinctly imagine the impossible (Essay 5, Chapter 6). The disagreement between them really concerns whether there is a form of clear conception other than clear imagining: Hume thinks that there is not; Reid thinks that there is, and that this non-imaginative form of conception allows us clearly to conceive impossibilities.

7. The Imagination and Hume’s Skepticism

In the concluding section of Treatise Book 1, Hume professes himself “a sceptic” (T 1.4.7.15; SBN 274). In his “Abstract” of the Treatise, he describes the philosophy that it contains as “very sceptical” (T Abs 27; SBN 657). In the first Enquiry, he presents what he calls “sceptical doubts concerning the operations of the understanding” (E 4) and a “sceptical solution of these doubts” (E 5); and he concludes this work by endorsing what he calls “mitigated scepticism” (E 12.24–34; SBN 161–5). The questions of what Hume’s skepticism consists in, and whether this skepticism is compatible with his program of establishing a “science of man,” are some of the most central—and most contested—questions in Hume scholarship.

Skepticism appears in the titles of two Treatise sections and three sections of the first Enquiry. In four of these five sections, Hume argues that “reason” cannot carry out a certain function, and that this function therefore falls to “the imagination.” In the Treatise section “Of scepticism with regard to reason” (T 1.4.1), he argues that our faculty of reason cannot explain why we believe the conclusions of our reasoning; were it not for a feature of the imagination, our confidence in our conclusions would be destroyed by the thought of our own fallibility (T 1.4.1.9–12; SBN 184–7). In “Of scepticism with regard to the senses” (T 1.4.2), he argues that reason cannot explain how we come to believe in the continued and distinct existence of sensible objects, at times when nobody perceives them (T 1.4.2.14; SBN 193). Instead, that belief must derive from the imagination and its fictions (T 1.4.2.15–55; SBN 194–217); for discussion, see sections (5b) and (5c), above. In the Enquiry sections “Sceptical doubts concerning the operations of the understanding” (E 4) and “Sceptical solution of these doubts” (E 5), he argues that our beliefs about unobserved things are “not founded on reasoning, or any process of the understanding” (E 4.15; SBN 32) and, instead, that these beliefs are due to the association of perceptions in the imagination (E 5.11, 5.14; SBN 48, 50–51).

This suggests that Hume’s skepticism has something important to do with the demotion of reason, and the promotion of the imagination, as explanatory factors in his science of man. In order to determine exactly what this skepticism consists in, we must determine what Hume means by the terms “reason” and “the imagination” in the sections of his works that present his skeptical arguments. Scholars are divided on this matter, and the rest of this section briefly surveys the main interpretive issues. It focuses on Hume’s claim that our beliefs about the unobserved are not founded on reasoning, but are due to imaginative association; hereafter, let us call this “Hume’s Skeptical Claim.” (Those seeking to interpret Hume’s claims about reason and the imagination in his other skeptical arguments will face issues similar to the ones discussed here.)

One issue is whether, in Hume’s Skeptical Claim, the terms “reason” and “the imagination” express Hume’s own distinction between the two parts or sub-faculties of the inclusive imagination: reason, understood as the sub-faculty responsible for demonstrative and probable reasoning; and the exclusive imagination, understood as the sub-faculty responsible for whimsies, prejudices, and various fictions (T 1.3.9.19n22; SBN 118; for discussion, see sections (2a)–(2c), above). Some scholars think that these terms do express Hume’s distinction (in keeping with the wording of this paragraph’s first sentence). If they are right, then Hume’s Skeptical Claim means that our beliefs about the unobserved are not founded on demonstrative or probable reasoning. (This might make Hume’s views seem paradoxical, because he often says that our beliefs about the unobserved are produced by probable reasoning—in fact, he says that the mental process responsible for these beliefs is “not only a true species of reasoning, but the strongest of all others” [T 1.3.7.5n20; SBN 97n]. But these scholars will interpret Hume’s phrase “founded on” in such a way that beliefs can be produced by probable reasoning without being founded on probable reasoning.)

Other scholars think that Hume’s Skeptical Claim does not concern his distinction between reason and the exclusive imagination, but some other distinction. On one version of this proposal, Hume means to contrast reason, as his opponents conceived it, with the inclusive imagination, as he conceives it. Hume’s opponents thought that reasoning involved mental events or processes that are both rational and explanatorily basic; see section (2e) above. Perhaps Hume’s Skeptical Claim means that reason, conceived in his opponents’ way, cannot explain our beliefs about unobserved things; hence, these beliefs must instead be explained by the inclusive imagination—specifically, by the sub-faculty of the inclusive imagination by which we carry out demonstrative and probable reasoning.

If these scholars are correct, then we face a second interpretive issue: exactly what conception of reason is at stake in Hume’s Skeptical Claim? Scholars have made numerous proposals, including a “rationalist” conception of reason as deduction; a Lockean conception of reason as the finding of “intermediate ideas”; and reason as a faculty of rational perception, which encompasses sensation and intuition, as well as reasoning.

A third interpretive issue is whether Hume’s Skeptical Claim is supposed to be a normative claim—that is, a claim involving some evaluation of our beliefs about the unobserved—or a purely descriptive claim about the mechanism that produces these beliefs. According to a traditional interpretation, Hume’s Skeptical Claim is normative, and it means that we have no justification at all for our beliefs about the unobserved: hence, that none of these beliefs is in better standing than any other. On this traditional interpretation, then, Hume understands the imagination to be a source of completely unjustified beliefs. (Whether this applies to the inclusive imagination, or just to the exclusive imagination, will depend on how we settle the first interpretive issue, above.) This interpretation was popular in the mid-twentieth century but, since the 1970s, it has been subject to numerous challenges and is now a minority view. Perhaps the most serious challenge to it is that Hume endorses some beliefs about the unobserved, while criticizing others: for example, he endorses his own claim that all simple ideas—even those that he has not observed—are copied from simple impressions; but he criticizes beliefs in miracles (E 10). It is not clear how he could do this, if he thought that all beliefs about the unobserved were equally devoid of justification.

For this and other reasons, some late twentieth and early twenty-first century scholars argue that Hume’s Skeptical Claim is a purely descriptive claim about the mental process by which we form these beliefs, with no implications about our justification for them. Others argue for an intermediate interpretation, which says that Hume’s Skeptical Claim is normative, but does not completely rule out all forms of justification for our beliefs about the unobserved. For example, according to some, Hume means to say that our beliefs about the unobserved are not justified by means of rational insight, while allowing that certain of these beliefs might be justified by some other means. These purely descriptive and intermediate interpretations both allow that the imagination may be a source of justified beliefs, in Hume’s view. (Again, whether the relevant sense of “imagination” is inclusive or exclusive depends on how we settle the first interpretive issue, above.)

8. References and Further Reading

a. Primary Sources by Hume

Hume’s theory of the imagination informs much of his thinking about human mental and social phenomena, so almost all of his works are relevant to this theory. The most relevant are:

Hume, David. A Treatise of Human Nature, Vol. 1: The Text. Ed. by David Fate Norton and Mary J. Norton (Oxford: Clarendon Press, 2007). [Cited as “T,” followed by book, part, section and paragraph numbers, followed by corresponding page number in L. A. Selby-Bigge and P. H. Nidditch’s 1978 edition of the Treatise, set off by “SBN.”]
- In this early work, first published 1739–40, Hume develops a theory of the imagination and uses it to explain an enormous range of cognitive, passionate, and social phenomena.
Hume, David. An Enquiry Concerning Human Understanding. Ed. by Tom L. Beauchamp (Oxford: Clarendon Press, 1998). [Cited as “E,” followed by section and paragraph numbers, followed by the corresponding page number in L. A. Selby-Bigge and P. H. Nidditch’s 1975 edition of Hume’s Enquiries, set off by “SBN.”]
- Recasts some of Treatise Book 1, including its accounts of the origins and association of ideas, and its central arguments about probable reasoning and causation.
Hume, David. A Dissertation on the Passions; The Natural History of Religion. Ed. by Tom L. Beauchamp (Oxford: Clarendon Press, 2007).
- The Dissertation recasts Hume’s discussion of the passions from Treatise Book 2, including his theory of the “double association” of impressions and ideas. The Natural History of Religion uses Hume’s theory of the imagination to explain how human beings first came to have religious beliefs.

b. Primary Sources by Other Early Modern Philosophers

The following Early Modern works are especially helpful to read in connection with Hume’s theory of the imagination:

Descartes, René. The Philosophical Writings of Descartes. Two vols. Trans. and ed. by John Cottingham, Robert Stoothoff, and Dugald Murdoch (New York: Cambridge University Press, 1985). [Cited as “CSM,” followed by volume and page number]
- For Descartes’s early views on the imagination, see the Rules for the Direction of the Mind, especially Rule 12 (CSM 1:39–51). For his mature views, see especially The Passions of the Soul, Part 1, articles 19–21 (CSM 1:335–6), and the Second, Fifth and Sixth Meditations (all in CSM 2).
Descartes, René. The Philosophical Writings of Descartes, Volume III: The Correspondence. Trans. and ed. by John Cottingham, Robert Stoothoff, Dugald Murdoch, and Anthony Kenny (New York: Cambridge University Press, 1991). [Cited as “CSMK,” followed by page number.]
- Several of Descartes’s letters clarify his views of the imagination in helpful ways. See, especially, his letters to Mersenne of 16 June, 1641 (CSMK 183–4), July 1641 (CSMK 184–7) and 22 July, 1641 (CSMK 187); his letter to “Hyperaspistes” of August 1641 (CSMK 188–97); and his letter to Princess Elizabeth of 28 June, 1643 (CSMK 226–9).
Hobbes, Thomas. Leviathan. In The English Works of Thomas Hobbes, vol. 3. Ed. by Sir William Molesworth (London: John Bohn, 1839).
- Hume’s views of the imagination are likely indebted to Hobbes. See especially chapters 2 and 3 of Leviathan.
Malebranche, Nicolas. The Search after Truth. Trans. and ed. by Thomas M. Lennon and Paul J. Olscamp (New York: Cambridge University Press, 1997).
- Malebranche presents his views of the imagination in Book 2 of The Search after Truth. Book 2, Part 1, Chapter 5; and Book 2, Part 2, Chapter 2 provide especially helpful background to Hume’s views about the imagination—especially his theory of association.
Reid, Thomas. Essays on the Intellectual Powers of Man. Ed. by Derek Brookes (University Park: Pennsylvania State University Press, 2002).
- Hume’s contemporary Thomas Reid criticized Hume’s theory of the imagination on numerous fronts. Many of his criticisms are contained here. On Hume’s theory of the imagination, see especially Essay 3, Chapter 7 (“Theories Concerning Memory”), which criticizes Hume’s way of distinguishing memory and imagination; Essay 4, Chapter 2 (“Theories Concerning Conception”), which criticizes Hume’s views of the role that representational ideas play in imagining, as Reid understood them; Essay 4, Chapter 3 (“Mistakes Concerning Conception”), which criticizes the view that whatever is conceivable is possible; and Essay 5, Chapter 6 (“Opinions of Philosophers about Universals”), which criticizes Hume’s view of abstract ideas and the arguments he gives in support of it.

c. Secondary Sources

The secondary literature on Hume is enormous and, because his theory of the imagination is so central to his science of man, much of the literature is relevant to it. Here is a selection of especially relevant and helpful contributions.

Abramson, Kate. “Sympathy and the Project of Hume’s Second Enquiry.” Archiv für Geschichte der Philosophie 83.1 (2001): 45–80.
- Helpfully discusses the relationship between Hume’s views on sympathy in the Treatise and those in the second Enquiry. Presents a convincing case that the Enquiry uses the same imaginative mechanisms as the Treatise to explain our moral sentiments.
Beebee, Helen. Hume on Causation (New York: Routledge, 2006).
- Contains clear and insightful discussions of Hume’s views on demonstrative and probable reasoning, as well as on causation. Develops and defends a projectivist interpretation of his theory of causation.
Dorsch, Fabian. “Hume.” In The Routledge Handbook of Philosophy of Imagination, edited by Amy Kind, 40–54 (New York: Routledge, 2016).
- Stimulating critical discussion of Hume’s theory of the imagination and its significance for twentieth- and twenty-first-century philosophy of mind. Sketches a revised, “neo-Humean” theory of the imagination, designed to meet several objections to Hume’s own views.
Everson, Stephen. “The Difference Between Feeling and Thinking.” Mind 97.387 (1988): 401–13.
- Very clear and illuminating discussion of the properties of force and vivacity. Argues that these are functional properties of perceptions, not phenomenological ones.
Garrett, Don. Cognition and Commitment in Hume’s Philosophy (New York: Oxford University Press, 1997).
- Essential reading on Hume’s faculty psychology in general, and his theory of the imagination in particular. Chapter 1 helpfully situates Hume’s views on the imagination in relation to those of his Early Modern predecessors.
Garrett, Don. “The Literary Arts in Hume’s Science of the Fancy.” Kriterion 44.108 (2003): 161–79.
- Discusses the role of Hume’s theory of the imagination in his attempt to found a science of aesthetic criticism. Read together with the opening chapters of Cognition and Commitment in Hume’s Philosophy, this makes an excellent introduction to Hume’s theory of the imagination.
Grene, Marjorie. “The Objects of Hume’s Treatise.” Hume Studies 20.2 (1994): 163–77.
- Helpfully surveys the ways in which Hume uses the term “object” (as in his Separability Principle, that whatever objects are distinct are distinguishable, and that whatever objects are distinguishable are separable by the thought and imagination). Identifies three senses of “object” in the Treatise: i) targets of attention, or intentional objects; ii) impressions; iii) non-mental, external objects.
Kail, P. J. E. Projection and Realism in Hume’s Philosophy (New York: Oxford University Press, 2007).
- This is a book-length study of Hume’s account of projection and his use of this imaginative function to explain belief in the external world, religious belief, belief in causal necessity, and moral belief. Contains much helpful discussion of the imagination and its relation to our other cognitive faculties.
Loeb, Louis E. Stability and Justification in Hume’s Treatise (New York: Oxford University Press, 2002).
- A novel and provocative interpretation of Hume’s epistemology. Contains much insightful discussion of his associationist theories of probable reasoning and sympathy, and of imaginative fictions.
Lightner, D. Tycerium. “Hume on Conceivability and Inconceivability.” Hume Studies 23.1 (1997): 113–32.
- Argues that Hume’s Conceivability Principle concerns our actual ability to conceive, given our actual stock of simple ideas; and that Hume does not accept the Inconceivability Principle.
Owen, David. Hume’s Reason (New York: Oxford University Press, 1999).
- Essential reading on Hume’s attempt to explain reasoning in terms of more basic imaginative functions; also includes helpful discussions of Descartes’s and Locke’s theories of reasoning, and Hume’s relationship to them.
Traiger, Saul. “Impressions, Ideas, and Fictions.” Hume Studies 13.2 (1987): 381–99.
- Influential discussion of Hume’s view of imaginative fictions; subsequent discussion of this important topic is largely indebted to this article.
Traiger, Saul. “Hume on Memory and Imagination.” In A Companion to Hume, edited by Elizabeth S. Radcliffe (Malden, MA: Wiley-Blackwell, 2011).
- Helpful introductory discussion of Hume’s distinction between the memory and the inclusive imagination. Raises several questions about this distinction, and surveys several interpretations of it.
Wilbanks, Jan. Hume’s Theory of Imagination (The Hague: Martinus Nijhoff, 1968).
- Devoted to Hume’s theory of the imagination. Contains a helpful survey of older interpretations. Argues that Treatise Book 1 and the first Enquiry each presents an overall argument for mitigated skepticism, in which Hume’s theory of imagination plays a central role.
Wright, John P. The Sceptical Realism of David Hume (Minneapolis: University of Minnesota Press, 1983).
- Contains much valuable discussion of Hume’s theory of the imagination and its role in his accounts of causation and the external world. Especially helpful on Hume’s debt to Malebranche’s theory of the imagination.

Author Information

Jonathan Cottrell
Email: jonathan.cottrell@wayne.edu
Wayne State University
U. S. A.

John Anderson (1893-1962)

Scottish-Australian philosopher John Anderson was a passionate defender of a philosophy typically described as Realism. Anderson exercised a significant and lasting influence over several generations of students, including such later philosophers as John Passmore, J.L. Mackie, and D.M. Armstrong. These students criticised and developed several key features of Anderson’s own philosophy such as the defense of a theory of objective good, the rejection of any theory of absolute morality or imperative, and a logically rigorous approach to metaphysical questions such as causation.

The significance of this influence was primarily due to Anderson’s systematic conception of philosophy as a Realist theory of logic, ethics, and aesthetics; this was unusual because it was in a century dominated by the analytic method in philosophy. Anderson’s systematic conception of philosophy sought to provide a unified theory of the traditional subjects of philosophy freed from their association with Platonism and Idealism.

Hence Anderson advanced a doctrinal conception of metaphysics as composed of three distinct elements: an epistemology of Realism as the direct experience of things, an ontology of Empiricism as situations of objects existing independently of the relations they have, and the logical theory of Positivism as the theory that terms and propositions always refer to existing objects or situations. Further, Anderson also developed the important ethical and aesthetic theory that ‘good’ and ‘beauty’ can never be defined in terms of the relations that they have (for example, being valued, being obliged, being striven after, ‘expressing’ etc.) and must be qualities of natural human activities.

One of the most controversial aspects of Anderson’s ontology was his logical analysis of Space-Time and its refutation of ‘physicalist’ (that is, general relativity) assumptions about the nature of space-time. Put briefly, space-time (in Einstein’s sense) cannot have an origin or a boundary because beyond that limit or origin, there must always be more Space-Time (in Anderson’s sense). Associated with this view of existence was his theory of the categories. Developing the ideas of Samuel Alexander, Anderson argued that there were thirteen categories that are universal to all things.

Introduction
Early Systematic Realism: 1926-1937
Mature Philosophy 1939-1962
Conclusion
References and Further Reading
1. Primary Sources
2. Secondary Sources

1. Introduction

John Anderson was born in 1893 in the small town of Stonehouse, twenty miles south-west of Glasgow, Scotland. He attended the local school, where his father was headmaster, before enrolling at the Hamilton Academy in 1907. In 1911, he won first place in the All Scotland Bursary Competition which enabled him to study at Glasgow University from 1911 to 1917. During this time he won several prizes and awards, before graduating with an M.A. degree in philosophy in 1917. From 1917 to 1922 he worked in the philosophy departments at Glasgow University and the University of Wales, before being employed at Edinburgh University on the Shaw Fellowship from 1922 to 1926. In 1927, he was appointed Professor of Philosophy at Sydney University and, apart from a sabbatical year spent in Britain and America in 1938, he remained there until his retirement in 1958. He died in Sydney in 1962 at the age of 69.

Anderson was initially influenced by the Hegelian Idealism established at Glasgow University by Edward and John Caird and further developed by Anderson’s own teacher, Henry Jones. The distinctive form of Hegelianism developed at Glasgow University is typically described as Christian Idealism, as it emphasized the dialectical development of an organic universe culminating in the realization of God. Anderson was also influenced by other Scottish philosophers such as Robert Adamson and John Burnet, although it was the Australian philosopher, Samuel Alexander, who had the most significant and permanent impact on his philosophical development. Anderson had already been exposed to the Realist writings of the early Moore and Russell, and the work of William James and the American New Realists before he attended Alexander’s Gifford lectures at Glasgow University between 1915 and 1917. These lectures, later published as Space, Time and Deity, were to exercise such a decisive influence on Anderson that he was still lecturing on them thirty years later. [Anderson 2005, 2007i] From within the history of philosophy itself, the most significant influences on him were Heraclitus, Socrates, Plato, and Hegel. He often praised Greek objectivity with its emphasis on ‘things’ and thought that almost all of modern philosophy, with its emphasis on epistemology, was irrelevant to the systematic study of philosophy, with only Hegel and Alexander being singled out for praise from the entire modern period. [Anderson 2008i, 2008ii]

Anderson’s intellectual development was also influenced by a range of non-philosophical thinkers including James Joyce, Matthew Arnold, Sigmund Freud, Karl Marx, Georges Sorel, Giambattista Vico, Henrik Ibsen, Fyodor Dostoevsky, and Herman Melville. He was particularly well read in the Marxist and Communist literature of the early twentieth century, being familiar with the works of Engels, Lenin, Trotsky, Bukharin, Kautsky, and Stalin. While these various political and cultural influences cannot be said to fit neatly into Anderson’s overall philosophic system, his interest in these thinkers meant that his students gained an education that was much more ‘continental’ in scope than was typical at other English-speaking universities around the world at that time. In a very real sense, Anderson’s students received an education that was unique. This diverse intellectual and cultural context, when coupled with the logical discipline of his systematic philosophy, enabled many of his students to succeed in a variety of academic and non-academic occupations.

In philosophy, his students included John Passmore, J.L. Mackie, and D.M. Armstrong, whose collective work in metaphysics, ethics, and the history of philosophy contributed greatly to the development of philosophy during the last quarter of the twentieth century. [Armstrong 2001, Passmore 1997] Apart from these luminaries, Anderson’s students who pursued careers in academic philosophy number more than thirty and comprise more than a dozen professors. These include Alec Ritchie (Professor of Philosophy), David Stove (Associate Professor of Philosophy), A.J. Baker, Bill Doniela (Associate Professor of Philosophy), Sandy Anderson, Ruth Walker, Kim Lycos, Margaret Mackie, Gaius MacIntosh, George Molnar (author, Powers: a Study in Metaphysics), Alan Olding (author, Modern Biology and Natural Theology), Perce Partridge (Professor of Philosophy), Tom Rose (Associate Professor of Philosophy), Vic Dudman, Robert McLaughlin, Eric Dowling, Graham Pont, and David Dockrill.

Anderson’s students also succeeded to a number of professorial positions outside philosophy. These include Eugene Kamenka (Professor, History of Ideas), L.J. Hume (Reader, Political Science), Les Hiatt (Professor of Anthropology), Donald Horne (author, The Lucky Country), Doug McCallum (Professor of Political Science), Neil McInnes (author, The Western Marxists), Wesley Milgate (Professor of English), Bill Morison (Professor of Law), H.J. Oliver (Professor of English), Bill O’Neil (Professor of Psychology), John Ward (Professor of History), Brian Beddie (Professor of Government), Hedley Bull (Professor of International Relations), P.D. Craig (Professor of Chemistry), Rawdon Dalrymple (Professor of Government), J.A.B. Holland (Professor of Divinity), A.D. Hope (Professor of English), James McAuley (Professor of English), and R.F. Jackson (Professor of French). Studying under Anderson also enabled a great number of his students to succeed in the commercial, journalistic, and political spheres. He is widely regarded as the ‘father’ of the Sydney Push, the amorphous social and political movement that produced such significant public intellectuals as Clive James, Germaine Greer, Robert Hughes, and Barry Humphries.

Anderson’s tenure at Sydney University was a controversial one and he was twice subject to public censure: once by the Sydney University Senate, and once by the NSW Parliament. He participated in many student societies at the University and was president of the Sydney University Freethought Society for over twenty years. He was also a passionate defender of freedom of speech and thought, a critic of all forms of censorship, and an active participant in several radical political movements during his early years.

Since his death, Anderson’s philosophy has been little studied and, in general, is misunderstood. A significant reason for this state of affairs is that he only wrote journal articles during his lifetime and the great majority of these were for the Australasian Journal of Psychology and Philosophy, which, at that time, was not widely read internationally. He was working on the publication of a selection of these articles at the time of his death and these were published posthumously as Studies in Empirical Philosophy. Over the next forty years, only two other collections of his work were published – Art and Reality dealing with aesthetics and literary criticism and Education and Inquiry dealing with education. The creation of the position of the John Anderson Research Fellow at Sydney University led to the publication of a number of editions of his lectures from 2003 to 2008, but that position has been vacant since 2008. It is not known when, or if, that position will be filled again so that further publication of Anderson’s philosophy can occur.

Another difficulty in properly understanding Anderson’s philosophy is the spread of his philosophical writing during his academic career. During his first ten years at Sydney University (1927-1937), he wrote 75% of all his published articles in philosophy. Hence for the remaining twenty-four years of his life (1938-1962) he wrote only 25% of his published work. This lack of publication after 1938 makes assessment of his mature philosophical position very difficult. While an irregular spread of publication over a lifetime might not always present a problem for understanding a philosopher’s theories, in Anderson’s case it is significant because in his political writings, where the same irregular publication occurs, there is a radical change in his political position: from a radical Communist in 1927 to a conservative anti-Communist in 1962. He described his political writings for the decade from 1927 to 1937 as his ‘proletarian’ period, as at this time he adhered to a general Communist theory of society and political change. After 1938, he entered his ‘anti-proletarian’ period of political theorizing which was characterized by an emphasis on liberty, democracy, the importance of traditions in political life, and anti-Communism.

Given these marked changes in his political position, the question is raised whether Anderson’s philosophic views also went through any substantial changes during his lifetime. To address this question adequately, it is necessary to present the systematic character of Anderson’s philosophy as formulated during his first decade at Sydney University. It is during this period that Anderson’s ‘systematic Realism’ is most in evidence. The systematic nature of this philosophy can then be contrasted with the more episodic writings of the last twenty years of his life to assess whether any pronounced changes occurred in his understanding of philosophy.

2. Early Systematic Realism: 1926-1937

The most distinctive feature of John Anderson’s philosophy is that it was a systematic philosophy in a century dominated by the analytic methodology in Anglo-Saxon philosophy. System building of the sort Anderson engaged in was frowned upon by analytic philosophers, whether of the conceptual or linguistic variety. In contrast, continental philosophers at this time were much more at ease with systematic philosophy, although the phenomenological and existential orientation of that philosophy did not fit easily with Anderson’s scientific conception of philosophy. As early as 1922, Anderson was describing his systematic philosophy as a unified theory of the ‘sciences’ of logic, ethics, and aesthetics and this characterization of his philosophy is one that he maintained into the 1930s. For Anderson, to assert that logic, ethics, or aesthetics were sciences was to assert that they were definite subjects that could be studied and that their method of study was as objective as any other science.

a. Realism

The most common term that has been used to characterize Anderson’s philosophy is that of Realism, hence the common title ‘Andersonian Realism’. During the twentieth century, Realism was primarily understood as an epistemological term that refers to the ‘real’ or ‘objective’ existence of the object of knowledge. However for Anderson, Realism was not merely an epistemological doctrine, but, based on the theory of external relations developed by the American New Realists, was a systematic enterprise – a ‘systematic Realism’ – that treated the subjects of logic, ethics, and aesthetics in a Realist manner. It should be noted that Anderson’s understanding of the term ‘logic’ was not restricted to its contemporary usage of formal logic but was more of the traditional sense of metaphysics as including epistemology, ontology, and formal logic as distinct disciplines within that subject.

b. Realist Metaphysics

In an early article, “Realism and Some of its Critics,” Anderson outlined the logic he believed that his general metaphysical system followed. He argued that Realism appears firstly as an epistemology based on the doctrine of external relations. Secondly, it appears as an ontology, which he described as Empiricist, but is more properly described as a theory of situations or spatio-temporal existence. Finally it appears as a logic which he described as Positivist, meaning by that term, not the Logical Positivism common during the 1930s, but the more general view that logic is a positive and not a relativistic subject.

i. Realist Epistemology

Twentieth century Realism as first articulated by Moore, Russell, and others referred to either an epistemology of direct or indirect (representational) knowledge. For Anderson, such a distinction was false: Realism could only be an epistemology of the direct knowledge of the object. Anderson’s Realist epistemology was the view that in any relationship of knowledge, there are three distinct parts: a subject of knowledge or ‘knower’ (the –er); an object of knowledge or ‘known’ (the –ed); and the relation of knowing itself, the knowing (the –ing). On a Realist analysis, such a relationship had the logical form ‘S/r/O’; further, each part of the relationship – the S, r, O, or -er, -ing, -ed – was distinct from the other and could not be reduced to any other part. On this account, the relation of knowing between the subject and the object must be immediate and direct. Further, following the American New Realists, Anderson argued that the logic underlying such an epistemology was the doctrine of external relations, viz. in any relationship a/R/b, the terms of the relationship ‘a’ and ‘b’ exist independently of each other and of the relation ‘R’ between them. On Anderson’s view, to attempt to identify the relations a thing has with the things themselves is to commit the error of relativism. On this view, any attempt to identify or reduce the relation of knowing to the subject, or the object, is to render knowledge of either impossible as it fails to maintain the distinctness of the various parts of the relationship of knowledge. A further implication of this view for Anderson is that the qualities of either the subject or object cannot be constituted by the relations that it has. There is, then, an absolute distinction between qualities and relations. Again, to attempt to identify or reduce the qualities that a thing has with the relations that the qualities have is to commit the error of relativism. 
This criticism of relativism is the most common technique used in the exposition of Anderson’s systematic Realist philosophy.

This conclusion – that an object, or its qualities, cannot be constituted by the relations that it has – had several interesting implications for Anderson’s philosophy of mind. Firstly, in terms of his epistemological theory, it implied that ‘consciousness’ – understood as both a quality of mind and a relation that mind has – cannot exist. There is, in brief, no such thing as ‘consciousness’. Further, since being conscious is a relation, then what is conscious must be some other quality of mind and since volition or will is also a relation, then it cannot be the required quality of mind. For Anderson, the only possible quality of mind that could be aware or conscious of the things around it, had to be emotion, feeling, or affect. It is emotions which know and it is emotions which act. This is Anderson’s theory of ‘mind as feeling’. This view appears to square with at least two important facts we know about the mind from Freudian psychoanalytic theory: firstly, that the mind, while unconscious, that is, asleep, can be active, for example, dreaming; secondly, that while the mind is conscious, there can be conflicting emotions operating at the same time and this can account for such mental features as slips of the tongue – one emotion is conscious in wanting to say something, but another emotion actually says what it needs to. While such a theory appears to be quite suggestive as a Realist theory of mind, it is unfortunate that Anderson never developed a detailed theory of which emotions constitute mind and how they operate.

ii. Empiricist Ontology

Anderson, following Alexander, described his ontological theory of existence as Empiricism, although, quite clearly, this is not the position of the British Empiricists, a theory he regarded as idealistic. His own understanding of Empiricism was a theory of situations or occurrences, where a situation is an occurrence in Space-Time and, as such, is characterized by various categories of existence. Anderson argued that there are three important doctrines associated with Empiricism: pluralism, determinism, and objectivism.

Anderson’s pluralism is the view that any thing is both a particular and a universal. That is, any thing is a universal that is composed of various things which together constitute it, and is a particular and therefore part of a thing greater than itself. In other words, there are no absolute or pure universals and no absolute or pure particulars. This is the foundation of Anderson’s theory of infinite complexity whereby there is no indivisible ‘atom’ from which all things are made, and no unrelatable totality – no ‘universe’ – which has nothing outside of it. Hence Anderson distinguished his position from the theories of monism and atomism understood as theories of logical totalities and logical simples.

Anderson’s determinism is the view that every thing or occurrence is caused. Since every thing is an occurrence in Space-Time, every thing will have causal relations acting on it, and causal relations from it acting on other things. Hence Anderson opposed any theory of indeterminism and rejected the notion of a ‘free will’ as something outside the universe of spatiotemporal causality. Anderson’s objectivism is the view that any subject must also be an object. That is, he rejected the doctrine of subjectivism, viz. that there are things that are ‘irreducibly’ subjective. In terms of mind, the alleged ‘subjectivity’ of the mind is as objective as any other thing, that is, the mind as a subject, as a knower, is an existing thing with all the categorical features that other things possess.

iii. Positivist Logic

Anderson’s logical position is one he described as Positivism, although this is not to be confused with Logical Positivism, as Anderson believed that experimentation was an inadequate test for the truth or falsity of propositions. He believed, for example, that love is not something that can be studied experimentally. Anderson’s Positivism was simply a theory of the positive truth and falsity of propositions and therefore opposed to any theory which postulates that the context of a proposition or judgment determines its truth or falsity. Hence he rejected the Absolute Idealist view that it is the ‘Absolute’ which determines the truth of a judgment as well as the relativist position expressed by F.C.S. Schiller that it is the particular context of judging that determines the truth of a judgment. Judgments, or in Anderson’s terms, propositions, are true or false independently of any context at all. Their truth or falsity is determined simply in terms of things themselves. This theory of absolute truth or falsity implied the denial of any theory such as the Hegelian dialectic where logic ‘develops’ through the resolution of a logical contradiction into a ‘higher truth’.

Anderson accepted the traditional Aristotelian analysis of the proposition, viz. that it is composed of a subject, a predicate, a copula, and the quantifiers. The most general form of the proposition is ‘S is P’ where S is the subject function and P is the predicate function, while the copula ‘is’ incorporates both the positive and negative formulations, ‘is’ and ‘is not’. When the universal and particular quantifiers, ‘All’ and ‘Some’, are introduced, this yields the classical four forms of the proposition: All S are P (SaP), Some S are P (SiP), All S are not P (SeP), and Some S are not P (SoP).
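
For readers more familiar with modern notation, the four forms can be glossed roughly as follows. This is only an illustrative rendering and not Anderson’s own notation: the predicate letters S(x) and P(x) are assumptions introduced here, and the modern reading of the universal forms carries no existential import, which a traditional Aristotelian reading would normally add.

```latex
\begin{align*}
SaP &:\ \forall x\,\bigl(S(x) \rightarrow P(x)\bigr) \\
SiP &:\ \exists x\,\bigl(S(x) \land P(x)\bigr) \\
SeP &:\ \forall x\,\bigl(S(x) \rightarrow \neg P(x)\bigr) \\
SoP &:\ \exists x\,\bigl(S(x) \land \neg P(x)\bigr)
\end{align*}
```

On this gloss SaP and SoP are contradictories, as are SeP and SiP: exactly one member of each pair is true.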

However what distinguished Anderson’s treatment of the proposition from the traditional Aristotelian analysis was his insistence that the terms of the proposition must only refer to existing things and that the function of the copula was to indicate whether something was or was not the case, or, as he expressed it, was ‘an issue’. Hence he refused to admit fictional entities as terms in his logic and rejected non-existential uses of the copula such as the copula of identity and the copula of predication. So, any proposition must have a real or existing thing as a term in the subject function, a real or existing thing as a term in the predicate function, and the copula ‘is or is not’ either attributing the occurrence of the predicate to the subject, or not, as the case may be. In other words, a proposition expresses the occurrence of a predicate attributed to a subject in a particular situation located in Space-Time.

Another term used to describe Anderson’s Positivism was ‘Propositional Realism’. This expression is most typically used to describe the view that reality is propositional in nature. This is Anderson’s controversial identity theory of the proposition. Briefly stated, this is the view that any occurrence or situation is identical with any proposition asserted about it. Anderson had rejected the view that the ‘proposition’ could be a tertium quid (a representational view on which a proposition is something that mediates our description of an actual occurrence), which committed him to the view that there must be, in some sense, an identity between propositions and situations. The difficulty with this view was that while it seems quite natural and correct to assert that true propositions are identical with existing situations, this clearly cannot be the case with false propositions. What situation, it can be asked, is identical with a false proposition? Are we meant to believe that a false proposition is identical with a non-existing situation? Anderson’s own solution to this problem was to assert that the mind ‘mis-takes’ a non-existing situation, for example, the sky being green today, for an actual existing situation, viz. that the sky is blue. The plausibility of this reply focuses attention back on the initial criticism. If it is asserted that all situations are identical with propositions, then it is irrelevant to ask for the ontological status of false propositions. Ex hypothesi, they are identical with a situation. On the other hand, if we simply assert that it is only true propositions which are identical with existing situations, then there is no problem with the status of false propositions at all. As Anderson would say, the mind simply ‘mis-takes’ a non-existing situation for an existing one.

c. Realist Ethics

In his Realist ethical theory, Anderson drew a sharp distinction between ethics and morality. Morality is a system of imperatives and obligations which can only be understood relationally. If we assert that ‘X ought or should be done’, then there must always be a subject who is asserting that obligation. The function of moral science on this account is not to establish the absolute or categorical nature of the imperative, but to establish who is asserting such an imperative. In contrast, Anderson conceived of ethics as a science of goodness and badness. In this theory, good and bad are naturally occurring qualities of social and psychological activities. Hence he rejected the relativist view that relations somehow determine the quality of good. It is one of the more unusual features of Anderson’s ethical theory that there is no recommendation or discouragement implicit in the description of an activity as good or bad: we are simply describing an activity in an ordinary scientific sense.

As to the exact nature of the goods and bads themselves, Anderson argued that if we look at the history of ethical and moral philosophy then certain qualities which are consistently described as good, virtuous, or obligatory give an indication as to the nature of the goods themselves. In his classification of goods and bads, Anderson was much influenced by Sorel’s theory of the producer ethic and the consumer ethic. The producer ethic is one which consumes in order to produce, while the consumer ethic is one which produces in order to consume. This implies that the producer ethic is essentially creative, inquiring, and productive (qualities that are exhibited in Sorel’s classification of social culture of Art, Science, and Industry), while the consumer ethic is imitative, obscurantist, and consumptive. Anderson later included love as the good, and hate as the bad, within the domestic sphere.

Anderson also utilized Socrates’ view that goods support one another but oppose bads, while bads oppose both goods and other bads to assist in his classification process. In other words, goods are essentially supportive, while bads are essentially oppositional. On this basis, Anderson argued that goods are co-operative and communicative, while bads are competitive and uncommunicative. That is, goods, in their relations of support, will seek to work with and communicate with other goods, while bads, in their relations of opposing, will fight against and not communicate with, both bads and goods. Anderson also listed other goods in his ethical theory including a care for exactitude and a rejection of the notion of a reward for doing something, but he never developed a full classification of the various goods and bads. This is one of the major criticisms of his ethical theory. It is one thing to assert that certain qualities are good and bad and that they operate in certain ways; it is quite another thing to actually show that this is a correct and true classification. Anderson never fulfilled this latter task.

i. Proletarianism

Anderson’s social and political theory during the 1930s has been described as a proletarian theory and by this term it is meant that it was of a general Marxist, and a specifically Communist, orientation. [Weblin 2003] Anderson’s Marxism was unique in that while he adhered to an overall materialist outlook such as the class structure of society and the distinction between economic base and social superstructure, he rejected Marx’s historical theory of dialectic. Hence while he agreed that during capitalism the proletariat and the bourgeoisie are engaged in class warfare, he did not think that the proletariat would succeed simply because it was part of a dialectical progression of history. Anderson’s analysis of this social conflict was also more pluralistic than Marx’s as he believed that the proletariat needed to work with artists and intellectuals to achieve social and political revolution. In this respect, the bourgeois origins of certain artists and intellectuals, such as Anderson himself, were irrelevant to the ongoing social conflict.

Apart from the broad proletarian orientation of Anderson’s social and political theory during the 1930s, there were two quite distinct moments in his active political engagement. From 1927 to 1932 Anderson was actively involved in the Communist Party of Australia. At this time, he believed Russian Communism was the pre-eminent model for Communist parties everywhere, although he supported the independent operation of local parties. Initially unaware of the pernicious influence of Stalin in Moscow, Anderson came to see that the Russian party was beset by bureaucracy, censorship, and ideology, and his independent stance increasingly brought him into conflict with local members who were more prone to following the Moscow line. Anderson’s writings in defense of Communism during this period generally reflect his belief in determinism, pluralism, and objectivism in social and political activity.

In 1933, he helped form the Trotskyist Workers Party of Australia and remained actively involved for the next four years. His break with Communism in 1933 was occasioned more by his recognition of the corrupt nature of Stalinism than by any belief that Communism was inconsistent with his philosophic doctrines. Hence during this period, he retained the belief that Communist theory, untainted by Stalinist practice, was deterministic, pluralistic, and objective, and accepted that Trotskyism provided a viable theoretical and practical alternative to Stalinism. However, by the time of his departure on sabbatical in December 1937, he had come to reject Trotskyism as a viable alternative to Stalinism and was questioning whether Marxism was in fact consistent with his Realist philosophy.

d. Realist Aesthetics

In his Realist aesthetic theory, Anderson often criticized aesthetic theories on the basis of either their relativism or their subjectivism. [Anderson 1982] Hence, in criticism of Subjectivism in aesthetics, he argued that if beauty is simply a question of what the subject believes or prefers, then there is nothing that is beautiful in itself. Such a claim is quite simply a denial of the very possibility of aesthetic theory. In contrast, in his criticism of relativist aesthetic theories such as Romanticism or Marxism, he argued that if beauty resides merely in the political context of the aesthetic judgment or the active willing of the aesthetic judgment, then again there can be no objective aesthetic theory. From these arguments, it could be reasonably assumed that Anderson believed that beauty, like goodness, was a quality of natural objects. However Anderson never explicitly stated this view, although he did once assert that beauty is a ‘character’ of natural objects.

Apart from these more formal features of Anderson’s aesthetic theory, there is some indication that he was also developing a more substantive theory of aesthetic damnation and redemption. Several times during the 1930s he quoted Joyce’s expression that ‘history is a nightmare from which I am trying to awake’, and took this subjection of man to history to be the state of alienation of the self. Release from this servitude, Anderson suggested, is the affirmation of the human spirit through artistic creation and aesthetic criticism. However, he did not develop these views in detail.

3. Mature Philosophy 1939-1962

After Anderson’s return from sabbatical in 1938, there was a marked reduction in his academic output, with only 25% of his entire academic corpus being written over the next twenty-three years. While there was a marked change in his political views, the changes in his philosophical views are less detectable. In his writings on ethics, aesthetics, and history, it was not immediately apparent that he was departing from systematic Realism.

a. Ethics

During the war years, whilst Anderson’s ethical writings still stressed the qualitative nature of goodness, there was also an apparent change in emphasis on the concepts of liberty and servility as ethical relations within society. While he rejected an apparent inconsistency in his view that goodness is something which can be pursued, he still emphasized that goodness is a character or quality of specific social activities and cannot be identified with ‘that which is obliged or commanded’. He was also forced to clarify whether such qualities were psychological – for example, creativity and inquiry – or social – for example, co-operation and communication. His belief that goods, such as inquiry, might occur at the intersection of the psychological and social fields, led to speculation that goods might occupy a unique place of ‘psycho-social’ activity. [Eddy 1942] Anderson now stated that goods only occur in ‘causes’ or social movements which themselves strive after freedom. Accordingly, individuals who participate in social movements can be transformed by these movements to such an extent that they ‘transcend’ their self-imposed limitations and become free and creative in the process.

Apart from developing the formal features of his own theory, Anderson criticized both Christianity and Socialism for fostering an ethic of philanthropy. He argued that philanthropy seeks to provide relief to the underprivileged, but that such protection actually weakens the operation of those actual and independent social movements which can provide escape from the servitude of bourgeois society. He argued that such servility is not something that one can be ‘saved from’, as it is only by what men are and not by what they are given that they can win release from servitude. He also criticized Mill’s theory of ethical hedonism, arguing that while pleasure is a quality of natural things and hence could in principle define the nature of goodness, in fact it is too narrow a conception to provide such a definition.

This emphasis on liberty and servility became a more prominent feature of Anderson’s ethical writing and he argued that goods only exist in their struggle with evils. Accordingly, any attempt to abolish evil must also result in the abolition of good. In particular, he asserted that liberty only exists in its struggle with servility, and that the attempt to establish a State where insecurity and insufficiency are abolished is a servile goal and can never succeed.

i. Anti-Proletarianism

After his return from sabbatical, Anderson gradually developed a distinctive theory of liberal democracy. Consistent with his view that liberty only exists in relation to servility, he argued that liberty was not to be found enshrined in the rights, rules, and procedures of the State, but was exemplified in independent opposition to the State. Hence, at the time of the formulation of the Atlantic Charter with its statement of the ‘four freedoms’, he argued there are no rights that can guarantee freedom. A ‘right’, in this view, is only the expression of a certain social ‘might’.

Similarly, Anderson believed that democracy was not a ‘thing’ that is instituted in a polity, for example, representative democracy, but was the balance of diverse social interests, one of which was the State itself. Even though a polity may be nominally called a representative democracy, if the social and political organizations within that polity do not oppose incorporation into the State structure, it cannot truly be a democracy.

After the end of the war, Anderson’s writings on political theory were infrequent, although in the writing that does exist, two themes are dominant. Firstly, he opposed Communism at every opportunity. However, this opposition did not extend to supporting the banning of the Communist Party of Australia and in the 1950 referendum on this issue, he publicly and forcefully argued for the No case. Further, his opposition to the theoretical underpinnings of Communism led him to assert that egalitarianism was ‘the disease of the modern time’. Secondly, Anderson also defended the general features of a conservative theory of society. In particular, he defended the notion that social and cultural traditions have their own ‘rights’ and modes of operation with which the State must not interfere. This was especially the case with universities and academic traditions. It should be stressed that the distinction between Anderson’s democratic and conservative periods is not clear-cut, although in general terms he referred to his political thinking during this period as ‘anti-proletarian’. [Stavropolous 1992]

b. Aesthetics

After his return from sabbatical, the only writings of Anderson’s that deal expressly with aesthetic theory are his 1942 lectures on ethics and aesthetics. These lectures are a detailed discussion of the concept of beauty understood as either a theme in temporal arts such as music or drama, or as structure in spatial arts such as painting and sculpture. However, while Anderson was undecided on whether theme or structure was the best general description of beauty, he made the remarkable assertion that beauty cannot be a quality. Anderson’s equivocation over describing beauty as a quality during the 1930s has already been noted, but the implications of expressly denying that beauty is a quality should not be underestimated. He must either assert that beauty is a relation and therefore deny the very possibility of any objective aesthetic theory or he must admit the inconsistency in his position and argue that aesthetics does not form part of his systematic Realist philosophy. In his more substantive aesthetic theory, he argued that man’s estrangement from society is caused by the loss of love between self and others, and that this estrangement can only be overcome by the activity of love. He further argued that the ‘eternal affirmation of the spirit of man in literature’ can be achieved in science as well as art, the difference being in terms of their style of presentation.

c. Ontology

Even though Anderson published very little after 1943, in his lectures he presented details of his ontology which were unknown previously. In his 1944 and 1949 lectures on Alexander’s Space, Time and Deity, Anderson presented his criticism of Alexander’s philosophy of emergent evolutionism. [Anderson 2005, 2007i. The 2005 lectures are in Anderson’s own hand and therefore are more accurate] For Alexander, Space and Time are the point and moment from which the universe begins and from which all things and qualities emerge, culminating, ultimately, in the emergence of Deity. [See entry in IEP on Samuel Alexander] In some places in Space, Time and Deity, Alexander held that Space and Time are created (and hence Super-Substantialist), although in other places he simply refers to Space and Time as the logical conditions of existence. It is this latter understanding of Space and Time that Anderson developed in his own lectures on Alexander.

i. Space-Time

Like Alexander, Anderson rejected both Idealist and physicalist accounts of the nature of Space and Time. He rejected the Idealist claim that Space-Time is simply an aspect of the Absolute and also rejected the physicalist or substantialist theory whereby Space-Time is itself a thing which comes into existence. Anderson argued that if we examine our own experience of Space and Time, we discover that our experience of Space is characterized by one, two, and three dimensionality such that we experience the spatiality of all things in terms of their length, breadth, and height. He similarly argued that our experience of Time is characterized by successiveness, transitiveness, and irreversibility. That is, our experience of Time is characterized by the experience of successive times, in that one time follows another; such times are transitive, in that if B follows A, and C follows B, then C must follow A; and time is irreversible, in that we never experience time ‘going backwards’. He argued further that while we can abstractly separate Space and Time to consider their individual characteristics, we must always experience them as unified in Space-Time: there can be no experience of Space which is not also an experience of Time and there can be no experience of Time which is not also an experience of Space. Further, there are no limits to Space-Time insofar as both Space and Time are infinite. Space, in its extension, and Time, in its duration, have no finite beginning or end. So for Anderson, to say ‘a thing exists’ is to say it is an occurrence in Space-Time. He also called an occurrence in Space-Time a ‘situation’.

ii. The Categories of Existence

For Anderson, a thing, in existing in Space-Time, has certain categorical features. He followed Alexander’s treatment of the categories, albeit in a slightly modified form, and argued there were a total of thirteen categories: Identity, Diversity, Existence, Relation, Universality, Particularity, Number, Order, Quantity, Intensity, Substance, Causality, and Individuality. However, we immediately strike a difficulty in Anderson’s treatment of these categories because, while in the text of his lectures he treats the above thirteen categories separately, when he came to classify them he treated two of them – Universality and Quantity – as having dual senses, and therefore expanded the number of categories to fifteen. Hence in one grouping of the categories he distinguishes between logical or propositional categories, mathematical or quantitative categories, and physical or qualitative categories in the following manner:

Logical: Identity; Diversity; Existence; Relation; Universality (Logical)

Mathematical: Universality (Mathematical); Particularity; Number; Order; Quantity (Mathematical)

Physical: Quantity (Physical); Intensity; Substance; Causality; Individuality (Physical Identity)

Quite clearly, this grouping can only be achieved by treating the categories of Universality and Quantity in the dual manner outlined above. Further, as if this classification were not confusing enough already, he also argued that the final category, Individuality, could be regarded as Physical Identity and thus contrasted with the first category, Identity, which he now described as Abstract Identity. This would imply that Individuality is simply one aspect of Identity and thus would reduce the original thirteen categories to twelve. The exact number of categories therefore depends on how they are approached: thirteen as they appear in the text; fifteen if they are treated as three groups of five; or twelve if Individuality is treated as simply an aspect of Identity.

Regarding the specifics of the categories, Anderson argued Alexander failed to present a principle by which the categories could be ordered and he argued that such a principle could be found in the ‘proposition’. He argued that in the logical or propositional categories, the first category, Identity, is treated as mere abstract identity and is indicated by the subject term of the proposition. In contrast, Diversity is everything that is not Identity and is indicated by the rest of the proposition. The contrast and distinction between Identity and Diversity gives us the category of Existence which is indicated by the copula in the proposition. However to assert one specific existence implies that we must have another distinct existence and hence we develop the category of Relation, of various existences related in Space and Time to each other, and these are indicated by the predicate of the proposition. Finally, in having various existences commonly related, we have the category of Universality in its logical sense as a theory of types.

In the transition to the quantitative or mathematical categories, Universality is now treated in its quantitative sense as the universal quantifier to the proposition, ‘All’. In contrast to the universal quantifier, we have the particular quantifier of the proposition, ‘Some’, and this gives us the category of Particularity. While Anderson often asserted that the universal and particular quantifiers were all that was needed to provide the four logical forms of the proposition, his next category, Number, indicates that Universality and Particularity are simply numerical in that they refer to objects that can be counted within a specific field or situation. Further, the next category, Order, indicates that not only can objects be counted, but that they can also be ordered within a given series as they occur in a given field or situation. The final mathematical category is Quantity and this refers to the fact that any number that occurs along a spatial or temporal continuum is real and hence either rational or irrational.

In the transition to the physical categories, Quantity is now treated in its physical sense as the filling of the spaces and times that mathematical Quantity indicated as only an abstraction. In this sense, physical quantity can be described as ‘matter’, although the better general description is ‘solidity’. On this account, solidity is the ‘space-filling’ that occurs when something is located in Space-Time. The next category, Intensity, is probably the most difficult of all of Anderson’s categories to understand. On the one hand, Intensity refers to the qualities that a thing possesses and if this were all that was meant it would be an unproblematic category. However, Anderson also intended Intensity to refer to any comparative of an object (such as its size) and it is clear that the one category cannot refer to both comparative differences between objects and their actual qualities. The next category is Substance and this refers to the structure of a thing or the internal balance or harmony of the tensions that occur within a thing. As a concrete example of Substance, Anderson argued that the substance or structure of water is H₂O. While this idea is clear enough, the relation of Substance to Intensity remains unclear. The category of Substance leads on to the category of Causality which is not a mere succession of situations, but involves the replacing of one situation with another. The final category is that of Individuality, which is the combination of the quality and quantity of a thing to give us the concrete identity of a thing. As concrete Identity, Individuality can be contrasted with the abstract Identity with which we began.

Some of the key criticisms of Anderson’s theory of the categories have already been noted, although there is one further criticism that goes to the heart of his metaphysical system. For Anderson, any meaningful proposition must have terms that are referential – that is, they refer to actual objects, qualities, and situations. However Anderson insisted on several occasions that the categories cannot be understood as mere things, as they are, by definition, those universal or categorical features which all things possess. The difficulty for Anderson is that, since his criterion for intelligible and meaningful discourse is limited to ‘propositions’ which have terms that refer to things, and since the categories can never be such terms, we can never have meaningful or intelligible discourse about the categories. This criticism is often described as ‘the unspeakability of the categories’.

d. Realism versus Idealism

While Anderson’s lectures on Alexander were an important contribution to the development and presentation of an ontology he described as Empiricism, it is important to note that at the very time that these lectures were being given, he appeared to be revising his assessment of the relationship between Realism and Idealism. During the 1930s, Anderson treated Realism and Idealism as logical opposites: one could not be asserted without denying the other and the assertion of Idealism led to certain irresolvable difficulties. It is surprising then that in 1949 in personal correspondence with his colleague Ruth Walker, he stated that his major intellectual problem since childhood appears to have been his ‘idealism’: his inability to accept multiplicity as a feature of the world rather than as something to be overcome or transcended. [Weblin 2005i] Even more significantly, he went on to say that only Walker would appreciate his idealism and see it as a ‘stimulating influence’ and ‘not as mere waste’. Unfortunately, Anderson did not go on to elaborate exactly how this ‘idealism’ manifested itself in his philosophy and so we have to simply accept his view that he regarded his philosophy, or at least significant parts of it, as ‘Idealist’. It is also noteworthy that in 1950 he wrote to Walker that he appears to be going ‘more and more Hegelian’ and in 1952 he spoke of his ‘revived Hegelianism’. Again, in neither case did he elaborate on the meaning of these statements and so we must accept his prima facie claim that he now thought of his philosophy in Hegelian terms.

Anderson did not reconsider his views on his systematic Realism until an address on the occasion of his retirement from Sydney University in 1958. [Anderson 1958] Much of this article was a standard defense of his conception of Realism. He argued that Realism denies the privileged position that Idealism had reserved for mind as qualifying all of reality and that there was no special difficulty in showing that there was nothing mental about the logic of relations. He emphasized that the most important advance made by Realism was the movement from the vague notion of ‘the real’ to the spatio-temporality of things as part of a general objective theory of reality. However, in an apparent qualification of his earlier views, he also argued that a common error made by Realists is to mistake the object of Realist attack as Idealism, whereas the real object of criticism is Rationalism and its dualist doctrine of ‘essences’. He also praised Hegel’s doctrine of Objective Mind as an important step towards a general objectivist position. So despite his criticism of the Idealist claim that relations are mental, he thought that Idealists such as Hegel had made important contributions towards Realism and qualified his earlier claim that Idealism is the true object of Realist criticism. When these admissions are coupled with his earlier self-description as an ‘Idealist’ promoting a ‘revived Hegelianism’, it is questionable whether he believed Realism was the best overall description for his systematic conception of philosophy.

e. History

During the 1950s Anderson’s main academic interest was the question of history. While he had shown a general interest in questions of history since the 1930s, and during the 1940s had dealt especially with Croce’s writings on history, from 1950 onwards he wrote several academic articles dealing specifically with the subject. [Anderson 1954, 1959, 1960] The most noteworthy feature of these writings was their consistency with Anderson’s Empiricist ontology. Firstly, Anderson insisted that history operates according to deterministic causal laws. There is no place for ‘free will’ within his theory of history. Secondly, Anderson’s theory of history was objectivist and materialist. That is, there is no place for any peculiar subjective or non-material entities. In this respect, even though Anderson was at this time rejecting Communism and egalitarianism as mere political ideologies, he was defending Marx’s historical materialism as an accurate theoretical account of historical forces. However he did not, as previously noted, adhere to a dialectical materialism. Dialectic, as a historical force, is inconsistent with strict determinism. Further, Anderson’s theory of history was a pluralist theory in that it recognized the complex interplay between psychological and social forces in history. One final feature of Anderson’s theory of history which coalesces with his Empiricist ontology is his emphasis on liberty as a dominant force in the working of historical processes. While this at first appears inconsistent with his denial of free will, Anderson understood liberty to be an objective and determined social force. It is only through the causally determined operation of social movements that liberty can be expressed.

f. Empiricism

In the last decade of his life, Anderson wrote little on his systematic conception of philosophy. However the last article he wrote was titled ‘Empiricism and Logic’ and this use of the term ‘Empiricism’, and the fact that his collection of articles was titled Studies in Empirical Philosophy, gives us a clear indication that Anderson believed that Empiricism was the best name for his overall conception of philosophy. In this article he argued that Empiricism is the doctrine of the continuity of all things; that is, that any thing, in existing, is continuous with all other things by existing in Space-Time and sharing common categories such as Substance, Causality, and Identity. Since these categories are universal features of any existing thing, they cannot themselves be things and can only be understood as formal features of things. Further, these common categorical forms can only be known and expressed in terms of propositional functions such as subject, predicate, and copula and the propositional form ‘S is P’. It is significant that Anderson also argued that the common measure of terrestrial events cannot itself be a thing, for such a common measure could only be something formal, that is, non-terrestrial. The ‘idealism’ in Anderson’s Empirical philosophy is clearly evident in this view that the logical or formal nature of categories and propositions could not be understood in terms of things subject to ordinary empirical experience.

4. Conclusion

For a philosopher who had a significant, albeit indirect, influence on twentieth century Anglo-Saxon philosophy, there is very little contemporary research into Anderson’s philosophy and a remarkably poor understanding of what that philosophy actually is. The explanation of this lies partly in the character of the man himself. His philosophic style was confined to writing condensed articles for a journal that, at that time, was remote from the main centers of philosophical activity. He never published a philosophic book during his lifetime and therefore never exposed himself to criticism beyond the confines of the Australian philosophical community. He also appears to have been reluctant to publish articles on philosophy during the last fifteen years of his life. Only a small percentage of his total published output was written during that period. However, the most serious criticism that can be directed at him is that he never bothered to develop the positions that he advanced. Whether it was his mature political position, his philosophy of mind, his ethical theory, or his aesthetic theory, Anderson sketched out a position but never provided the details or framework of how that position might be developed further. Be that as it may, the scope and logical rigor of Anderson’s philosophy provides a uniquely systematic alternative to the strictures of twentieth century analytic philosophy.

Nonetheless, the exact nature and name of this systematic philosophy is a matter of some debate. The most commonly known title of Realism was most widely used during the 1930s, but after that time Anderson made statements and advanced positions that clearly qualified his acceptance of the suitability of that term as a relevant description of his philosophy. In contrast, after the end of the 1930s Anderson used and discussed the term Empiricism far more widely. Indeed his lectures on Alexander during the 1940s are detailed examinations of the substance of Empiricism itself. It may be thought that such a change is merely a change of a name, although the change from the Realist doctrine of external relations and the consequent distinction between qualities and relations to a theory of spatiotemporality, propositionality, and the categories of existence is clearly more than simply a nominal one.

5. References and Further Reading

a. Primary Sources

Anderson, John (1954) ‘Politics and Morals,’ Australasian Journal of Philosophy 32: 213-22.
Anderson, John (1958) ‘Realism’ The Australian Highway (Journal of the Workers Educational Association, Australia): Sept. pp. 53-56.
Anderson, John (1959) ‘The Illusion of the Epoch’ Australasian Journal of Philosophy 37: 156-67.
Anderson, John (1960) ‘Time and Idea’ Australasian Journal of Philosophy 38: 163-72.
Anderson, John (1962) Studies in Empirical Philosophy. Angus and Robertson: Sydney.
- [Collection of Anderson’s metaphysical and ethical articles published posthumously. Includes 15pp Introduction by J.A. Passmore.]
Anderson, John (1980) Education and Inquiry. Edited by Phillips, D. Z. Basil Blackwell: Oxford.
- [Collection of Anderson’s educational articles]
Anderson, John (1982) Art and Reality. Edited by Cullum, Graham and Lycos, Kimon. Hale and Ironmonger: Sydney.
- [Collection of Anderson’s aesthetic and literary criticism articles. Includes 17pp Introduction by G. Cullum and K. Lycos.]
Anderson, John (2003) A Perilous and Fighting Life. Edited by Weblin, Mark. Pluto Press: Sydney
- [Collection of Anderson’s political articles. Includes 12pp Introduction and 10pp Postscript by M. Weblin.]
Anderson, John (2005) Space-Time and the Proposition. Edited by Weblin, Mark. Sydney University Press: Sydney
- [Anderson’s 1944 lectures on Alexander’s Space, Time and Deity. Original notes in own hand. Includes 18pp Introduction by M. Weblin.]
Anderson, John (2007i) Space, Time and the Categories. Edited by Cole, Creagh. Sydney University Press: Sydney
- [Anderson’s 1944 lectures on Alexander’s Space, Time and Deity. Student notes only. Includes 5pp Introduction by D.M. Armstrong.]
Anderson, John (2007ii) Lectures in Political Theory. Edited by Cole, Creagh. Sydney University Press: Sydney
- [Anderson’s 1941 lectures on T.H. Green’s Principles of Political Obligation, 1942 lectures on Political Theory (discussing Bosanquet and Lenin), and 1945 lectures on Socialism. Original notes in own hand. Includes 18pp Introduction by C. Cole.]
Anderson, John (2008i) Lectures in Greek Philosophy. Edited by Cole, Creagh. Sydney University Press: Sydney
- [Anderson’s 1928 lectures on Greek Philosophy. Original notes in own hand. Includes 11pp Introduction by G. Cullum]
Anderson, John (2008ii) Lectures in Modern Philosophy: Hume, Reid, James. Edited by Cole, Creagh. Sydney University Press: Sydney
- [Anderson’s 1932 lectures on Hume and 1935 lectures on Reid and James. Original notes in own hand. Includes 18pp Introduction by C. Cole.]
The John Anderson Archive, the online archive of Anderson’s lectures and articles at the University of Sydney Library (compiled 2006–2010).

b. Secondary Sources

Armstrong, David M., (1977) “On Metaphysics”, Quadrant, 21 (7): 64–69.
- [Outline of Anderson’s metaphysical position.]
Armstrong, David M., (2001) “Interview” Matters of the Mind: Poems, Essays and Interviews in Honour of Leonie Kramer, Edited by Lee Jobling and Catherine Runcie. Sydney: University of Sydney Press: pp. 322-332.
- [Discusses importance of Anderson on Armstrong’s intellectual development.]
Baker, A.J. (1979) Anderson’s Social Philosophy. Angus and Robertson: Sydney.
- [Outline of Anderson’s political theories and development.]
Baker, A.J. (1986) Australian Realism. Cambridge University Press: Cambridge.
- [Critical discussion of Anderson’s systematic Realism]
Birchall, B. (1983) “The Problem of Form” International Studies in Philosophy 15: 15-40.
- [Critical discussion of Anderson’s theory of form.]
Cole, Creagh McLean, (2009) “John Anderson’s Political Thought Revisited”, Australian Journal of Political Science, 44(2): 229–44.
- [Critical discussion of Anderson’s political theory.]
Cole, Creagh McLean, (2010) “The Ethic of the Producers: Sorel, Anderson and Macintyre”, History of Political Thought, 31(1): 155–76.
- [Critical discussion of Anderson’s ethical theory.]
Cole, Creagh McLean, (2012) ‘John Anderson’ Stanford Encyclopedia of Philosophy.
- [Detailed discussion of Anderson’s philosophy.]
Eddy, Harry (1944) “Ethics and Politics” Australasian Journal of Psychology and Philosophy 22: 70-92.
- [Critical discussion of Anderson’s psycho-social conception of good.]
Hibberd, Fiona, (2009) “John Anderson’s Development of (Situational) Realism and its Bearing on Psychology Today”, History of the Human Sciences, 22(4): 63–92.
- [Critical discussion of Anderson’s philosophy in relation to contemporary psychology.]
Kennedy, Brian (1996) A Passion to Oppose. Melbourne University Press: Melbourne.
- [Biography of Anderson’s life focusing on personal, social and political themes.]
Mackie, John L., (1951) “Logic and Professor Anderson”, Australasian Journal of Philosophy, 29(2): 109–113.
- [Reply to Ryle (1950).]
Mackie, John L., (1985) “The Philosophy of John Anderson”, Logic and Knowledge: Selected Papers (Volume I), Oxford: Clarendon Press, pp. 1–20.
- [Detailed exposition of Anderson’s philosophy. Originally published in AJP following Anderson’s death.]
Passmore, John (1969) “Russell and Bradley”. Contemporary Philosophy in Australia. Edited by R. Brown and C. Rollins. London: George Allen & Unwin, 21-30.
- [Critical exposition of Anderson’s pluralism.]
Passmore, John (1997) Memoirs of a Semi-detached Australian. Melbourne: Melbourne University Press.
- [Extensive discussion of Anderson’s influence on Passmore’s philosophical development.]
Ryle, Gilbert, (1950) “Logic and Professor Anderson”, Australasian Journal of Philosophy, 28(3): 137–53.
- [Critical analysis of Anderson’s philosophy.]
Stavropoulos, Pam (1992) “Conservative Radical” The Australian Journal of Anthropology 3: 67-79.
- [Discussion of Anderson’s conservatism.]
Weblin, Mark. (1995) ‘The Place of John Anderson in the History of Philosophy’ unpublished PhD thesis. University of New England, Armidale, N.S.W., Australia
- [Critical and systematic exposition of Anderson’s philosophical development.]
Weblin, Mark (2005) ‘John Anderson’ Dictionary of Twentieth Century British Philosophers. Edited by Stuart Brown. Bloomsbury Academic: London.
- [Discussion of Anderson’s place in twentieth century British philosophy.]
Weblin, Mark (2007) ‘John Anderson on Reid and Scottish Philosophy’ The Monist 90: 310-25.
- [Critical discussion of Anderson’s criticisms of Thomas Reid.]
Weblin, Mark (2010) ‘John Anderson and Idealism’ Biographical Encyclopedia of British Idealism. Edited by Sweet, William. Bloomsbury Academic: London.
- [Discussion of Anderson’s ‘Idealism’.]
Weblin, Mark (2014) “John Anderson Arrives: 1930s” History of Philosophy in Australia and New Zealand. Edited by Oppy, Graham & Trakakis, Nick. Springer: Netherlands, pp. 55-87.
- [Detailed discussion of Anderson’s philosophical, social, and political development.]

Author Information

Mark Weblin
Email: markweblin@gmail.com
Australia

Constructivism in Metaethics

It is difficult to provide an uncontroversial statement of constructivism in metaethics, since the terms of this doctrine are themselves the focus of philosophical debate. However, this view is now perhaps most commonly understood as a metaphysical thesis concerning how we are to understand the nature of normative facts–that is, facts about what we ought to do. Most broadly, it is the view that the correctness of our judgments about what we ought to do is determined by facts about what we believe, or desire, or choose and not, as realism would have it, by facts about a prior and independent normative reality.

Defenders of constructivism have claimed that it represents a new, free-standing alternative to familiar approaches in metaethics. If they are correct, traditional discussions in metaethics have overlooked an important position, one that is supposed to adequately explain the nature of our ethical thinking and practice while avoiding the kinds of objections that traditional views struggle with. However, in order for this to be the case, constructivism must be characterized more narrowly—as the broad characterization above would appear to be true of a number of well-established positions in metaethics (including response-dependence theories and other forms of subjectivism). What form this narrower characterization should take and whether constructivists can make good on this more ambitious claim remains controversial.

This article starts out in section 1 with a brief account of the origins of contemporary discussions of constructivism. Sections 2 and 3 canvass the main motivations and arguments for constructivism along with the various ways in which the view has been interpreted. Section 4 introduces a serious challenge to the ambitious claim that constructivism represents a new, free-standing approach in metaethics. Section 5 entertains a proposal developed in the 21st century that this challenge might overlook.

Origins of the View
The Scope and Ambition of Metaethical Constructivism
Motivation and General Argumentative Strategy
Is Constructivism “Free-Standing”?
A Challenge to Traditional Metaethics
References and Further Reading

1. Origins of the View

Contemporary discussion of constructivism in ethics largely originates in the work of John Rawls. Along the way to developing a normative foundation for just political institutions that could be divorced from deep and irreconcilable metaphysical disagreements, Rawls entertained a view that he called Kantian constructivism.

Kantian constructivism holds that moral objectivity is to be understood in terms of a suitably constructed social point of view that all can accept. Apart from the procedure of constructing the principles of justice, there are no moral facts. Whether certain facts are to be recognized as reasons of right and justice, or how much they are to count, can be ascertained only from within the constructivist procedure, that is, from the undertakings of rational agents of construction when suitably represented as free and equal moral persons. (1980: 519)

The parties in the original position do not agree on what the moral facts are, as if there already were such facts. It is not that, being situated impartially, they have a clear and undistorted view of a prior and independent moral order. Rather (for constructivism), there is no such order, and therefore no such facts apart from the procedure of construction as a whole; the facts are identified by the principles that result. (1980: 568)

According to the view Rawls presents here, it is not the case that the moral facts merely coincide with such agreements or that the choice procedure may be used to discover what the relevant standards are. Rather, according to constructivism, moral truths (for example, truths about “right and justice”) are determined by procedures in the sense that the moral standards that fix the relevant class of moral facts are constituted by their emergence from special procedures. It is these facts that make our moral assertions true or false. This is what Rawls means when he says that there are no moral facts apart from the procedure.

Rawls discusses Kantian constructivism across a number of different works. However, his most thorough treatment of the view can be found in the above-quoted Dewey Lectures, collectively titled “Kantian Constructivism in Moral Theory”. Here, as in many of his other works, Rawls is primarily concerned with describing and defending an interpretation of justice as fairness–“the idea that the principles of justice are agreed to in an initial situation that is fair” (1971/1999: 11). Rawls famously argues that facts about justice are fixed by principles that would be agreed to in the original position, a procedure in which free and equal citizens agree to terms of social cooperation under the condition that they do not know facts about who they are or what they deeply care about (what Rawls refers to as “the veil of ignorance”).

Some scholars, including Rawls, see a historical precedent for metaethical constructivism in the work of Immanuel Kant. This is why Rawls and some other metaethical constructivists have described their views as Kantian. However, as a point of historical interpretation this is controversial. More recently, philosophers have developed versions of constructivism that are supposed to find their inspiration in the work of other historical figures. For example, along with the Kantian varieties, one can now find Aristotelian (Lebar 2008), Humean (Street 2008, 2010; Lenman 2010), and Nietzschean (Silk 2014) versions of constructivism represented in the philosophical literature.

Although Rawls is generally credited with introducing constructivism into contemporary metaethical debates, the details of his own presentation have tended to obscure the underlying view. In particular, his focus on the facts of justice, as opposed to moral or normative facts more generally, has led many to interpret him as merely presenting a restricted form of constructivism, one that is compatible with a number of metaethical interpretations (including realism). Moreover, Rawls is clear in later works that political constructivism (the name for his mature interpretation of justice as fairness) is intended as a practical doctrine, not a metaphysical one; it should be interpreted as neutral with respect to deeper philosophical commitments that could potentially be the source of reasonable political disagreements. This has led those interested in the view to look elsewhere for a full-fledged metaethical account of constructivism.

2. The Scope and Ambition of Metaethical Constructivism

As the preceding discussion of Rawls already suggests, there are different ways of interpreting constructivism depending on how one characterizes the scope of the view. And how one characterizes the scope depends in turn on how metaethically ambitious one intends one’s constructivism to be. The following subsection will present Plato’s famous Euthyphro Question as a way of introducing what is perhaps the broadest way of capturing constructivism. Although this broad characterization captures something that is recognizably metaethical, it also ends up capturing too much and, consequently, fails to describe constructivism in a way that would make it a novel and interesting alternative to familiar metaethical positions. The three remaining subsections present narrower conceptions of the view and discuss some of the challenges that defenders of these forms of constructivism face.

As we will see, these narrower versions of constructivism can be distinguished along one of three separate axes. The first axis concerns whether constructivism counts as a first-order account of some particular domain of ethical thinking or a second-order (that is, metaethical) account of the nature of ethical thinking as such. The second axis concerns whether we should expect constructivism to yield convergence on a single class of correct ethical judgments. The third axis concerns whether a constructivist account of the nature of ethical thinking as such counts as a novel, free-standing metaethical interpretation or, rather, as a version of one or another familiar alternative. Each of these axes will be explored in turn.

a. The Euthyphro Question

Ethical practice includes certain characteristic activities: we value things; we take actions to be wrong, or right, or permissible; we also take some things to count as reasons for acting. One question that metaethics considers is the relation between these activities and the kinds of properties and facts we most generally refer to when using evaluative, moral, and normative language (if indeed we use it to express a state of mind that refers to anything at all).

Although contemporary philosophy has generated a flurry of literature on this topic, the underlying question is a very old one. It is arguably a version of the one that Plato entertains in his presentation of a dialogue between Socrates and his eponymous interlocutor, Euthyphro. There the topic is piety. Specifically, is something pious because the gods love it? Or do the gods love it because it is pious? Plato’s brief arguments in the dialogue have been interpreted in different ways and have done little to quiet interest in these questions. In order to broaden the terms of the discussion, we may recast Euthyphro’s question “in rough secular paraphrase” (Street 2010: 370).

Do we value things because they are valuable? Or do things have value because we value them?
Do we take certain actions to be wrong (or right or permissible) because they are wrong (or right or permissible)? Or are these actions wrong (or right or permissible) because we take them to be so?
Do we take certain features of the world to count as reasons for action because they are reasons? Or do these features count as reasons for action because we take them to be so?

These are metaphysical questions. How we answer them will say something about the kinds of facts or properties that exist and what they are like–for example, whether an account of these facts and properties must make essential reference to these activities or the standpoints that characterize them. Constructivists–about (i) value, or (ii) morality, or (iii) reasons–answer “no” to the first of each of these pairs of questions and “yes” to the second. Realists–about (i) value, or (ii) morality, or (iii) reasons–answer “yes” to the first of each of these pairs of questions and “no” to the second. In order to simplify things, this article will use “ethical” to refer to the broadest category of practical thinking, one encompassing all three of the categories above. In other words, then, the distinction the Euthyphro question prompts us to consider is one between constructivism and realism in metaethics.

Things are actually more complicated here. Neither constructivism nor realism may be alone in taking these particular positions, depending on how one defines each view. On the one hand, one might object that the distinction is not merely between constructivists and realists but between constructivists and realists or quasi-realists. Defenders of quasi-realism, such as Simon Blackburn, explicitly claim that they side with realists on this issue but, nevertheless, reject the realist’s view about what it is to make an ethical judgment–that is, they reject the realist’s account of the semantics of ethical terms and expressions and its accompanying metaphysical commitments. On the other hand, one may object that a defender of a response-dependence theory or subjectivism, more generally, would respond to these questions in the same way that the constructivist does. Although the Euthyphro question is a helpful point of entry for understanding what is generally at issue in debates between constructivists and realists, the distinction it introduces captures broader families of views under which constructivism and realism fall. Hence, if we are to understand what is special and interesting about constructivism and the challenge it poses to realism, we must introduce a further set of distinctions.

b. Local versus Global

One such further distinction concerns whether we ought to understand constructivism as restricted to a local domain of ethical discourse or whether we ought to take constructivism to provide a global account of the nature of ethics (Enoch 2009). For reasons that will become clear in the course of the discussion, this distinction is also sometimes described in terms of “restricted” versus “thoroughgoing” accounts of constructivism (see Street 2008 for coinage of these terms). According to the former understanding, one could opt to provide a constructivist account of a species of ethical facts–for example, facts about justice–but remain silent as to whether other species of ethical facts–for example, facts about morality or reasons for action–are constructed. According to the latter understanding, constructivism provides the correct account of all ethical facts (including facts about justice, morality, value, and reasons). This distinction matters for two reasons.

First, many early and influential presentations of constructivism in ethics appear to be restricted to local domains of ethical discourse. For example, let us consider Rawls’s justice as fairness and T.M. Scanlon’s (1998) constructivist view of morality, or what we owe to each other. According to the former, the facts of justice are determined by principles agreed to by free and equal citizens who are faced with the task of establishing fair terms of social co-operation. According to the latter, the facts about which acts are morally wrong are determined by whether these acts “…would be disallowed by any set of principles for the general regulation of behaviour that no one could reasonably reject as a basis for informed, unforced, general agreement” (Scanlon 1998: 153). These two views count as forms of constructivism insofar as each characterizes a particular class of ethical facts (facts about justice, facts about morality) as a function of our volitional attitudes–that is, facts about what we would will, or choose, or agree to, or reject.

As we have already seen, Rawls would appear to restrict his constructivist treatment to facts about justice. In fact, his mature statement of political constructivism would appear to require this. Rawls argues that a constructivist account of all of morality (or value or reasons for action) would be subject to reasonable disagreement and could not serve as a stable basis for an enduring consensus about what justice requires. Similarly, Scanlon restricts his constructivist account of morality to facts about what is right, wrong, and permissible; it does not extend to all facts about what our reasons for action are.

Second, such local forms of constructivism would appear to be compatible with different metaethical interpretations (including realism) and, hence, would not count as competitors to standard alternatives in metaethics. In order to illustrate this point, let us assume (as many philosophers now do) that the concept of a reason is the basic normative concept which unifies different areas of ethical thinking and practice–that is, different areas of ethical discourse are ultimately concerned with particular classes of practical reasons. Working under such an assumption, one might say that Rawls’s political constructivism grounds reasons of justice in a further set of reasons: namely, those had by free and equal citizens who are faced with the task of establishing fair terms of social co-operation. Similarly, one might say that Scanlon’s moral constructivism grounds moral reasons in further reasons: namely, a reason to live together with others on terms that they could not reasonably reject together with reasons for rejecting proposed sets of moral principles. But importantly, neither Rawls nor Scanlon provides an account of the nature of these grounding reasons. Rawls’s political constructivism does not attempt to explain what it is for free and equal citizens in this situation to have such reasons; Scanlon’s moral constructivism does not explain what it is to have a reason to live together with others on terms that they could not reasonably reject or what it is to have reason to reject a proposed set of moral principles. In principle, one might explain these grounding reasons in non-constructivist terms–for example, in terms of expressivism or even realism. Indeed, Scanlon himself appears to favor a kind of non-naturalist realist interpretation of these more basic reasons. Yet another option would be to explain these grounding reasons themselves in constructivist terms.

A global form of constructivism is one that attempts to explain the nature of reasons as such in constructivist terms. Again, we will follow defenders of this view and assume here that the concept of a reason is the basic normative concept. It follows then, on this view, that all normative facts are constructed. In order for this to be the case, however, constructivism cannot privilege any one class of substantive reasons as the grounds for all the others. For again this would leave open the possibility of explaining these grounding reasons in terms of some other metaethical interpretation. Rather, a global constructivism must be thoroughgoing and apply to all practical reasons; it must avoid making any substantive commitments about the content of our practical reasons.

In other words, constructivism must be characterized formally (Street 2010: 369). This means that what it is to have a reason is not characterized in terms of other kinds of reasons but, rather, in terms of certain formal conditions (for example, consistency and coherence) that one’s attitudes must satisfy. On this view, one has a reason to perform some act just in case the idealized set of attitudes one would have, were these formal conditions to be satisfied, would (in some sense) support performing this act. By characterizing what it is to have a reason in such terms, global (or thoroughgoing) constructivism would at least appear to be incompatible with the kind of objectivity involved in a robust understanding of ethical realism (see section 3a for discussion). However, whether this characterization suffices to distinguish global constructivism from other familiar non-realist alternatives in metaethics is less clear. Before discussing this, one should note two further distinctions that have been made within the class of global constructivist views.

c. Humean versus Kantian

Sharon Street (2010) has observed that global, formally-characterized versions of constructivism may be further divided into those views that guarantee convergence on a single class of substantive reasons for action and those that do not guarantee such convergence. In other words, global constructivists may disagree about whether the formal constraints on one’s attitudes are sufficient for generating a single set of normative truths or whether these constraints may yield different, conflicting sets of normative truths depending on the attitudes one starts out with. The former she calls Kantian constructivism (note that, unlike Rawls, Street restricts the use of this term to forms of global constructivism); the latter Humean constructivism.

According to Humean constructivists–which importantly includes Street (2008, 2010)–formal construction conditions constrain what one’s reasons are but they do not fully determine the content of the reasons that one has; rather, the content of one’s reasons depends to a significant extent on the content of the attitudes that one starts out with. So, for example, two people with radically divergent attitudes (beliefs, desires) to start with are likely going to have divergent sets of reasons for action. In other words, Humean constructivists are open to the possibility of a kind of relativism about practical reasons. This means that, on their view, it might turn out that when the constructivist’s formal conditions are applied to most people’s attitudes, they yield a set of practical reasons that support common-sense views about morality, like the view that we have reason not to torture others for fun. However, on their view, it might also turn out that an ideally-coherent Caligula–that is, someone who values torturing others for fun above all else and whose attitudes satisfy all of the formal constructivist conditions–has absolutely no reason not to torture other people for fun (Street 2009, 2010). Of course, such a view fails to conform extensionally to many people’s considered judgments about the substance of practical reasons. This is considered by some to be a serious theoretical cost to accepting this form of constructivism.

According to Kantian constructivists like Christine Korsgaard (1996), the formal conditions of construction will always generate a single class of normative outcomes, regardless of the attitudes one starts out with–that is, even if one starts out with the non-idealized attitudes of a Caligula (generally considered a sadistic and insane tyrant). In other words, Kantian constructivists think that they can preserve a strong, anti-relativist sense of objectivity for practical reasons without the controversial metaphysical commitments of realism. Moreover, Kantian constructivists argue that the single class of practical reasons generated by constructivism is also the very same class of reasons traditionally supported by “common sense”–that is, one including moral reasons to keep one’s promises, to aid, and to forbear from harming others. However, some philosophers argue that the constructivist cannot guarantee such convergence without smuggling in realist metaphysical commitments; others argue that such a view confuses the theoretical aims of metaethics with those of a first-order theory of practical reason.

d. Familiar versus Novel

Regardless of where one stands on the issue of convergence, there is the further question of whether “global constructivism” is supposed to capture a familiar class of metaethical views or whether it is supposed to refer to a novel and as-yet unexplored alternative to these. Some philosophers employ the term “constructivism” to capture a broad family of views, one characterized by their shared response to the Euthyphro question. While this understanding of constructivism might draw our attention to commonalities among these views that normally go unnoticed, it does not offer a novel position. This arguably limits our interest in constructivism. Other philosophers employ the term “constructivism” more narrowly to capture a much more ambitious view, namely, one that counts as a free-standing alternative to existing metaethical positions. If they are correct, then traditional discussions in metaethics have overlooked an important position (compare Sayre-McCord 1988, Railton 1996), one that is supposed to adequately explain the nature of our ethical thinking and practice while avoiding the kinds of objections that traditional views struggle with.

3. Motivation and General Argumentative Strategy

Since constructivism is often framed in opposition to the realist response to the Euthyphro question, it should not be surprising that one standard way of motivating constructivism is to present it as a response to the putative failings of realism in metaethics. But how we are to understand realism or anti-realism in metaethics is itself contested. This of course complicates things. If constructivism is presented as a response to realism but the commitments of realism are themselves contested, it would appear that in beginning with realism we are not beginning with the clearest framework for understanding constructivism. Despite this complication, however, this framework can still be useful. The situation just requires that we be explicit about how we are to understand the term “realism” in this context.

a. Constructivism versus Realism

It is fairly uncontroversial to take realism in metaethics to include at least the following two conditions:

(1) Atomic ethical statements are the kind of things that may be literally true or false.

(2) At least some of them, literally construed, are true.

These conditions look promising insofar as they serve to contrast realism with two commonly recognized non-realist competitors. The first condition contrasts the view with non-cognitivism. Non-cognitivists deny that ethical statements are straightforwardly or literally fact-stating; they rather claim that we use ethical language to perform some non-descriptive function or to express some non-representational state of mind. The second condition contrasts the view with an error theory. Defenders of an error-theoretic account of morality accept that we use ethical language to report beliefs but claim that all of these beliefs are systematically false because ethical terms and expressions fail to refer to anything. Again, both types of views are framed in opposition to realism. Insofar as (1) and (2) achieve this contrast, they provide a helpful way of understanding realism’s commitments.

These two conditions also appear sufficient to distinguish realism from some statements of constructivism. For example, Christine Korsgaard (2003) has described constructivism in ways that look incompatible with (1). On her account, ethical concepts do not refer to facts that we may come to know and apply in deliberation; rather, they refer to practical problems that agents must solve. The details of this proposal are not completely clear, but some have argued that Korsgaard’s view does not construe moral truth literally. If this were indeed the case, the first two conditions alone would suffice for distinguishing realism from constructivism.

Even if some statements of constructivism might be ruled out by these conditions, however, others are not. In fact, part of what many take to be attractive about constructivism is that it does satisfy these two conditions. By taking on board some of the features of realism and rejecting others, constructivists claim to capture all that is attractive about realism while avoiding standard objections against it. If constructivism failed to satisfy (1) or (2), a defender of the view could not claim any such advantage; for without them there is nothing of realism to speak of. This means that we need to add some other condition(s) to our account of realism, one or another that captures the distinction that these other constructivists have in mind.

Russ Shafer-Landau has proposed the following candidate:

(3) There are moral truths that obtain independently of any preferred perspective, in the sense that the moral standards that fix the moral facts are not made true by virtue of their ratification from within any actual or hypothetical perspective. (2003: 15)

This condition is used to describe what is sometimes referred to as the stance-independence of ethical facts and properties (for coinage of this expression, see Milo 1995). This is because it makes at least some instances of moral truth independent of “any preferred perspective”, actual or hypothetical. A perspective, or standpoint, is a complex system of intentional psychological states–or stances–such as beliefs, desires, commitments, reactive attitudes, and so forth.

According to (3), even if some ethical standards come into existence because they figure as the objects of our desires, or choices, or beliefs, and so forth, there are some that exist independently of our intentional stances. Importantly, this characterization of realism still allows that some of the reasons we have may be the result of choices and agreements we have made. For instance, it seems plausible to think that one can come to have special reasons to do things in virtue of the promises one makes and the specific intentions that this promise-making involves, reasons that one would fail to have in the absence of such promise-making. The realist can allow for this. However, she will insist that not all of the reasons one has are like this; importantly, there are also some we have independently of any of the desires, choices, or beliefs we have or the choices or agreements we have made. For example, she may insist that our reason to uphold the practice of promise-keeping in general is just such a stance-independent reason. In other words, while our reason to make good on a particular promise may depend on our having made this promise in the first place and the specific intentions that this involves, our reason to keep promises more generally would not depend on our having promised to do anything; rather, it is a reason we have that is independent of our attitudes or any perspective or standpoint which they might combine to form. We will soon have occasion to say more about what a perspective or standpoint is. For now, however, it is just important to note how this third condition serves to contrast realism with constructivism.

Unlike the first two conditions, this one does appear to get at the distinction many constructivists have in mind. Specifically, it would appear to give some substance to the intuitive sense of dependence implicit in the Euthyphro questions stated earlier. These questions suggest that constructivism differs from realism insofar as it makes ethical facts and properties depend on our ethical practice in some essential way.

If we accept that realism offers a stance-independent view of ethical facts and properties, constructivism ought to be understood by contrast as a species of a stance-dependent view. On this account, there are no moral, or ethical, truths that obtain entirely independently of any actual or hypothetical perspective. The standards that fix the relevant class of ethical facts are always made true by virtue of their ratification from within some actual or hypothetical perspective. Constructivists offer various characterizations of the relevant status-conferring perspective or standpoint as well as of the kind of ratification that is required. Some of these differences are discussed above in section 2.

Many constructivists accept (1) and (2), but argue that realists go too far in positing (3). This condition constitutes, in large part, the realist notion of ethical objectivity. For this reason, we might take constructivists to be rejecting the idea that ethical facts and properties are objective in the realist’s sense–while leaving open the possibility that they might count as objective in some other sense. Constructivists argue that by incorporating (3) realists fail to accommodate deeply held philosophical and ethical commitments. These failings fall under two broad categories.

b. Metaphysical and Epistemological Objections to Realism

The first supposed failing is that realism cannot accommodate our broader metaphysical and epistemological commitments. Here the concern is generally that realism about value, or morality, or reasons is incompatible with philosophical naturalism. Very roughly, this is the ontological thesis that the only kinds of facts and properties that exist are natural ones—that is, those facts and properties that (could) figure as the objects of investigation of our best scientific practices. The alleged problem is that ethical facts and properties could only satisfy condition (3) if naturalism were false.

There are two (related) versions of this argument in the literature. For example, according to one popular version of the objection made famous by J.L. Mackie (1977), ethical facts and properties exhibit certain necessary connections with our motivational capacities. This view is sometimes referred to as motivational internalism. If these motivational connections are understood naturalistically (for example, as connections between ethical judgments and an agent’s desires or dispositions to choose), it is hard to see how ethical facts and properties could enjoy the independence described in condition (3). They would have to be stance-independent by nature yet necessarily connected with certain motivational stances. The worry is that this would suggest, in the words of Mackie, that ethical facts and properties were “utterly different from anything else in the universe” (1977: 38). The conclusion here is that realism commits one to a kind of metaphysical queerness.

Mackie’s allegation of metaphysical queerness gives rise to a related concern about epistemological queerness. If ethical facts and properties are metaphysically different from anything else in the universe, why should we think that we could discover them in the same way we come to know natural facts and properties (that is, via observation and empirical theorizing)? Here the particular worry is that we could only come to know them via some mysterious faculty of intuition. Hence a queer metaphysics would require a queer epistemology.

While Mackie was the first to present these objections, there are also more recent versions of this kind of naturalistic argument–ones that respond to Mackie’s worries about queerness with a constructivist solution. For example, Street (2006) claims that realism is incompatible with our best evolutionary account of how we came to make the ethical judgments we do. According to this argument, if realism were true we would have no good explanation of how our ethical judgments have succeeded in matching (or “tracking”) stance-independent ethical truths; rather, the truth of these judgments would have to be entirely a matter of unlikely coincidence.

Constructivism, by contrast, is supposed to avoid these problems. By grounding ethical truths in features of intentional states, constructivists claim that their view makes use of only naturalistic materials, ones that can be accounted for by empirical psychology. These are features that may be appealed to in order to explain the apparent connection between ethical judgments and motivation. They might also help the constructivist avoid Street’s skeptical scenario. This is because the constructivist will argue that there is no serious gap between ethical judgment and truth that the skeptic may exploit.

Of course, these types of naturalistic concern alone do little to distinguish the constructivist challenge from others, such as the challenge error-theorists and expressivists mount against realism. In fact, it would appear as if every major challenge to realism incorporates some version of this worry. But this is not the only motivation to which constructivists appeal. This first type of concern is usually coupled with a second type.

c. Objections to Realism from within Ethical Theory

The second type of objection concerns realism’s failure to accommodate purportedly deep features of our first-order ethical thinking. These include many people’s commitment to the idea that there is a necessary connection between moral or evaluative judgments and reasons for action as well as the idea that autonomy is an essential and ineliminable aspect of our practical thinking. Unlike the first type of objection, which appeals to one’s broader philosophical commitments in metaphysics and epistemology, this objection comes from within ethical theory itself.

Many moral philosophers maintain, or are at least attracted to, a view called moral rationalism. According to this view, an action’s rightness necessarily provides some reason to perform it; alternatively, the goodness of an object or state of affairs necessarily provides some reason to pursue or promote it. These kinds of considerations arguably come from within first-order ethical theory, since the relata (rightness or goodness, on the one hand; reasons for action, on the other) are the proper objects of first-order ethical investigation.

Although rationalism per se is not incompatible with ethical realism (indeed, defenders of a robust form of non-naturalist ethical realism also accept something like this view), it does pose problems for realism when it is combined with a non-realist account of practical reason. That is, in order to secure a non-realist conclusion, constructivists must combine an appeal to rationalism with a rejection of realism about practical reason. On this view, moral facts about rightness or evaluative facts about goodness necessarily entail certain reasons for action, but these reasons are not to be understood as objective, stance-independent facts that we come to know through ethical inquiry; rather, these reasons are determined (that is, “constructed”) by agents engaged in the activity of practical reasoning. For an example of this kind of view see Korsgaard (1996).

This rejection of realism about practical reasons together with a commitment to rationalism puts pressure on one to accept a stance-dependent account of morality and value, as well. For it is difficult to see why there should be a necessary connection between such different types of things (compare Shafer-Landau 2003: 48-49). Why should one think that the realist’s objective, stance-independent moral or evaluative facts necessarily correlate with the results of the stance-dependent outcomes of practical reasoning?

The constructivist rejection of realism about practical reason in turn either rests on an appeal to broader metaphysical or epistemological commitments, like naturalism (which the aforementioned realists reject), or other deeply entrenched first-order ethical convictions, like the importance of autonomy for rational agency (which arguably a realist should also be concerned to preserve).

As with the argument from rationalism, the argument from autonomy again appeals to a commonly-held commitment within ethical theory. The claim in this case is that autonomy is an essential and ineliminable feature of practical agency and that such autonomy requires a kind of control that is at odds with a realist account of practical reason (Korsgaard 1996). According to this argument, autonomous practical agency involves self-legislation–in the sense that practical agents are both the authors of the content of practical laws and the authors of their own obligation to uphold these laws. Constructivism in metaethics is supposed to be fully compatible with such a view of autonomous practical agency. But if a robust, stance-independent realism were true, we could not be said to legislate the content of practical laws or our own obligation to uphold these laws. Therefore, autonomy favors thinking that constructivism in metaethics is correct and that realism is mistaken. For criticism of this argument, see Shafer-Landau (2003) and Robert Stern (2012).

Defenders of constructivism disagree about which of these considerations (naturalism or rationalism or autonomy) poses the strongest challenge to realism. However, as these brief remarks already illustrate, the arguments they advance conform to a general strategy: first, there is an appeal to one or another of these considerations; second, realism is presented as failing to adequately account for this particular consideration; finally, constructivism is presented as possessing superior resources for explaining it. In short, the constructivist claims a series of explanatory advantages over realism. Although one might reject one or another of the constructivist’s arguments, the constructivist contends that her view wins out on holistic grounds. Even if realism can accommodate some of these considerations to some extent, the constructivist argues that her view does a better job on the whole. In other words, the constructivist claims to get you everything the realist wants (and more) without any of the problems that realism supposedly introduces.

4. Is Constructivism “Free-Standing”?

Regardless of whether constructivism succeeds in the above arguments, one might still worry that “constructivism” is merely a new label for another well-established view, one whose virtues and defects have already received much attention. Here we return to one of the issues about scope that we encountered earlier. One traditional worry is that much of what has been said thus far about constructivism appears as if it applies to a family of views that are sometimes referred to as response-dependence theories. Subsequently, both constructivists and non-constructivists alike have questioned constructivism’s cognitivist credentials and pressed for details as to how such a view might contrast with expressivism. Furthermore, one might worry that constructivists only succeed at distinguishing their view from expressivism or response-dependence theories to the extent that they construe it as a form of simple subjectivism, a naturalistically reductive view that makes ethical facts or properties a function of an agent’s desires. How is ethical constructivism different from such views? Is constructivism a distinct alternative to response-dependence theories, expressivism, and simple subjectivism? Or does it perhaps represent a species of one of them? Alternatively, might we take constructivism to be the family or genus under which these other views fall?

If constructivism merely turned out to be a version of one of these other views (either species or genus), this would arguably detract from its importance. Part of what is supposed to make the constructivist challenge to realism an interesting one is that it has not received the attention it deserves. This would arguably not be true if it turned out to be a version of a response-dependence theory, or expressivism, or simple subjectivism. Although questions remain as to how we are best to understand the commitments of these other views, they have each commanded a lot of discussion already. In light of this, the strengths and vulnerabilities of each type of view are fairly well established. Moreover, it is difficult to see how constructivism might serve as an improvement on any of these familiar positions–since some of the most compelling objections to each, respectively, are general enough that they would appear to extend to any of their species, constructivist or otherwise. The Frege-Geach problem for expressivism and extensional worries about response-dependence theories (each discussed below) are arguably examples of this. For this reason, it is important to sketch out a sense in which constructivism might count as a free-standing metaethical alternative to these views–that is, a sense in which constructivism counts as a genuinely novel metaethical position and not just another way of describing a more familiar view.

a. The Proceduralist Characterization

Perhaps due to the influence of Rawls, constructivism has typically been understood as the view that ethical truth is determined by the outcomes of procedures. The proceduralist characterization of constructivism has been accepted both by advocates like Milo (1995), Korsgaard (1996) and Street (2008) and by critics like Darwall et al. (1992), Enoch (2009), Ridge (2012) and Copp (2013).

This has led many to doubt whether constructivism represents a free-standing metaethical position. A proceduralist characterization of constructivism easily lends itself to formulation in terms of a familiar family of stance-dependent views in metaethics: response-dependence theories. As such, constructivism would appear to represent a species of such views and, consequently, be subject to the same objections.

Response-dependent views have been described in different ways. Some have defended a response-dependence view of ethical properties. We might take such views to provide the following schematic account of the essence of ethical properties (compare Johnston 1989):

x is C iff (and because) x is such as to produce R in Ss under conditions K

Here C stands in for some ethical property (for example, the property of goodness, wrongness, being a reason), S for the subject, K for the relevant conditions, and R for the response. In order for this schema to pick out a response-dependent property, there are other conditions that must also obtain. For example, the biconditional cannot obtain trivially in virtue of S, K, and R specifying a “whatever it takes” condition. But, sidestepping the controversies involved in specifying such conditions, it is already apparent why one might take constructivism to represent a species of such views. The proceduralist characterization lends itself to formulation in terms of subjects, responses, and conditions. For example, Rawls’s statement of a constructivist view of justice might be made to fit the response-dependence schema in the following way.

A policy/institution, x, is just iff (and because) x conforms to principles that would be agreed to by free and equal citizens under the conditions represented in the original position.

Here, C is the property of justness. R is the disposition to produce agreement, a volitional response. S is a free and equal citizen of a society. K is the conditions described in the original position. There may be renderings of Kantian constructivism about justice that better capture Rawls’s view. The point here is just to illustrate how a view that focuses on procedures lends itself to an interpretation in response-dependence terms. In fact Rawls’s statement of constructivism is not unique. The constructivist proposals of T.M. Scanlon and Christine Korsgaard would also appear amenable to a response-dependence formulation.

But if ethical constructivism is best understood as a response-dependence view, as these examples suggest, it would not represent a free-standing metaethical view. Furthermore, it would appear to be subject to the same objections that have been levelled against these views. Different philosophers have recognized a dilemma for response-dependence theories (Blackburn 1993; Darwall et al. 1992), as follows.

We may understand a response-dependence account as providing either a non-reductive or a reductive account of some class of ethical properties. If it is non-reductive (that is, it includes some ethical terms on the right-hand side of the equation), the account will leave traditional metaethical questions unresolved. Specifically, it will not tell us how we are to understand the ethical terms employed in the account, leaving open the possibility that they may take a realist, or expressivist, or error-theoretic, and so forth, interpretation. If it is reductive (that is, it includes no ethical terms on the right-hand side), the account runs the risk of being extensionally inadequate. In other words, the account cannot guarantee that the outcomes will match our considered ethical convictions. In the case of constructivism, this might be expressed as the worry that the subjects of the relevant procedures, actual or hypothetical, might get things wrong. A second related worry concerns the intensional adequacy of such views. Here the objection is that reductive accounts would make the ethical facts “hostage to the outcome of irrelevant causal processes” (Street 2010: 374), irrelevant because our ethical judgments themselves might not appear to be about such agents and their judgments.

A defender of the proceduralist characterization might argue that constructivism provides some new way of navigating around this dilemma. But it is difficult to see how it could. The features of a response-dependence view that make it vulnerable to the dilemma are general and appear, as such, to apply equally to a proceduralist interpretation of constructivism.

A more promising response, perhaps, would be for the constructivist to accept one of the horns but argue that the associated objection is not as bad as one might think. But this type of response is also available to defenders of response-dependence views more generally. It still would not provide us with any reason for thinking that constructivism was a free-standing metaethical view; rather, it would appear as if constructivism and response-dependence theories (in general) stand or fall together. If constructivism is to represent a free-standing view it cannot be construed in terms of procedures. This has led defenders of the view to reject the proceduralist characterization and emphasize the ways in which constructivism differs from response-dependence theories.

b. The Standpoint Characterization

Sharon Street (2010) has argued that constructivism is best understood as a distinct and superior alternative to response-dependence views in metaethics. She describes metaethical constructivism as the view that

…the truth of a normative claim consists in that claim’s being entailed from within the practical point of view, where the practical point of view is given a formal characterization. (2010: 369)

Street’s account explicitly avoids any characterization of constructivism in terms of procedures. Again, if constructivism specifies a procedure, this leaves the view open to the dilemma just sketched. From this one might infer that the solution is to avoid talk of procedures (compare also James 2007). So the constructivist retreats from talk of procedural outcomes to what is alleged to be the less problematic talk of points of view, or standpoints.

Street illustrates the difference by appeal to an example about baseball and how response-dependence and constructivist views would differ in their response to the question “What is it for Jeter to be safe?” Whereas the former would state the conditions for Jeter’s being safe (a normative fact in baseball) in terms of the responses of an umpire (a good umpire would judge him to be safe), the latter states the conditions in terms of what would be entailed by the rules of baseball in combination with the non-normative facts. Street argues that this formally-construed constructivist alternative is immune to the standard objections that befall reductive response-dependence accounts. On the one hand, it yields the right results; it is extensionally adequate. Unlike the response-dependence view, it leaves no room for errors of judgment. On the other, it would appear to capture the sense of what it is for Jeter to be safe. That is to say, it is also, on Street’s view, intensionally adequate.

But what is a standpoint? Although many philosophers appeal to standpoints (especially those working in the Kantian tradition), there is very little detailed discussion of what they are. For example, Street describes the practical point of view as

…the point of view occupied by any creature who takes at least some things in the world to be good or bad, better or worse, required or optional, worthy or worthless, and so on–the standpoint of a being who judges, whether at a reflective or unreflective level, that some things call for, demand, or provide reasons for others. (2010: 366)

This description presupposes that we already start out with some sense of what such a standpoint is. Other descriptions of standpoints appeal to metaphor or invoke a distinction between the practical and the theoretical, each of which is supposed to represent a distinct and familiar way of experiencing the world (Wallace 2008).

Let us assume that a standpoint is constituted by a complex system of stances (that is, psychological states that we bear towards things and that, in virtue of being directed towards these things, have a kind of content) such as beliefs, desires, commitments, reactive attitudes, and so forth. In other words, it is a set of individual stances that hang together in a certain way. This much would appear safe if we are correct in understanding the distinction between realism and constructivism in terms of stance-dependence. But if we are to avoid the response-dependence dilemma, it is important that these stances not be described as issuing from any particular type of subject under specific conditions. One alternative would be to focus on the kinds of activities associated with various practical standpoints. Here the idea is that we first look to those familiar activities of, for example, valuing, taking something to be wrong, taking something to be a reason for acting, in order to identify the relevant practical standpoint and then ask what it is to engage in these activities as such.

Korsgaard (2003), James (2007) and Street (2010) describe the constructivist project as one of working out the “constitutive commitments” of various practical standpoints. This involves, amongst other things, the task of specifying the various ways in which particular types of stances must hang together so that one may count as genuinely engaging in a particular practical activity as such.

For example, stances can presumably either support or conflict with one another. Conflicting stances are ones that are in some sense inconsistent with each other. Although Katie may consistently take herself to have both some reason to attend the concert and some reason not to, she may not consistently take herself to have both an all-things-considered reason to attend and an all-things-considered reason not to. Someone who simultaneously maintains these stances arguably fails to meet the basic requirements for taking something to be an all-things-considered reason for acting. This kind of conflict illustrates perhaps the most straightforward sense in which practical stances may count as inconsistent. But there are other ways, too.

Consider someone who takes herself to have an all-things-considered reason to save her life, believes that in order to do so it is necessary that she see a doctor immediately, but takes herself to have no reason whatsoever to see a doctor (compare Street 2008: 227-228). These stances are also in some sense inconsistent with one another. As in the earlier example, someone who simultaneously maintains these stances arguably fails to meet the basic requirements for taking something to be an all-things-considered reason for acting.

A standpoint constructivist might claim that this is because the activity of taking-something-to-be-a-reason-for-acting as such is in part constituted by a norm of instrumental consistency, one that requires that one take oneself to have at least some reason to take the necessary means to one’s ends. As both Street and Korsgaard point out, someone who fails in these ways is not making a mistake about what her reasons are; rather, she does not genuinely count as taking herself to have an all-things-considered reason at all. But consistency is not the only kind of relation that one might take to matter.

Those stances that do not conflict may stand in various degrees of support to each other. Among a set of mutually consistent stances, some will be more central to a particular standpoint, others more peripheral. The extent to which stances on balance exhibit support of one another is a measure of their coherence. By comparison with consistency, it is more difficult to say how coherence might figure as a constitutive requirement for a particular standpoint. Presumably, one either does or does not count as occupying a standpoint; it is an all or nothing affair. But coherence comes in degrees. Surely one may count as genuinely engaging various practical standpoints even if the relations amongst one’s set of stances fall short of maximal coherence. But what if they fell short of minimal coherence?

Coherence might figure as a threshold requirement. Consider someone who is deliberating about whether to attend a party. She considers who will be there, whether there will be dancing, how she will feel the next morning, how this might affect her work schedule, and so forth. After much reflection she concludes “I have an all-things-considered reason to book a flight to Tokyo!” Although there is nothing apparently inconsistent with her taking up this stance, it does not mesh in any way (let us assume) with the considerations she has been entertaining. It fails to cohere with them in any obvious way. Someone who takes up this stance in the present situation arguably fails to count as taking herself to have an all-things-considered reason. Here we might say that the relation this stance bears to the background of the agent’s other stances exhibits a degree of coherence that falls below the threshold that is constitutively required to be counted as engaging in the activity as such.

Coherence might not be the only relation that matters in this way. There may turn out to be different constitutive norms whose satisfaction contributes to a standpoint’s overall coherence. Part of the constructivist project will involve describing what other types of norms or relations are constitutive of various practical standpoints. For example, James (2007) has claimed that the standpoint of practical reason is constituted by certain general norms which determine, among other things, which facts one should attend to in deliberation (a “norm of attention direction”), which to disregard (a “norm of disregard”), which to count as favoring a particular response (a “norm of favoring”), and which to assign more or less importance (a “norm of balancing”). Like the norm of coherence, but unlike the norm of instrumental consistency, these norms would also appear to allow satisfaction to various degrees.

Once the constructivist has an account of these various constitutive commitments in hand, she can then appeal to them to explain the truth or falsity of a particular ethical judgment. According to the standpoint characterization of constructivism, the truth of an ethical judgment is a function of what follows from within a particular practical standpoint. In other words, the truth of a particular ethical judgment is always to be understood as relative to the various relations and norms that constitute a particular practical standpoint. Given a particular collection of stances and the norms governing a particular standpoint, certain stances will follow, others will be ruled out, and still others will enjoy some degree of support but fall short of being “entailed” from within a particular ethical standpoint. This brief sketch leaves many questions unanswered. But it should already allow us to see how constructivists, like Street, might appeal to a standpoint characterization to distinguish their views from the family of response-dependence theories.

Importantly, our judgments about a particular practical standpoint are not about how some subject–actual or hypothetical–would respond; nor are they about how such agents ought to respond. The dilemma for response-dependence views is a result of the way in which these views describe ethical truths indirectly. On these views, the truth of an ethical judgment is described in terms of how some “suitably represented” agent would respond. This allows for a gap between our intuitive ethical judgments and our judgments about how such agents would respond. On the one hand, the agent could get things wrong; on the other, our ethical judgments themselves (the direct ones) might not appear to be about such agents and their judgments. The move to standpoints is supposed to close this gap.

According to the standpoint characterization of constructivism, an ethical judgment is not about what some suitably represented agent would judge or choose. Rather, it is about what follows from within a particular ethical standpoint. In contrast with response-dependence views, the standpoint does not represent an agent. Hence, there is no one who could get things wrong. Moreover, such a standpoint does not reduce ethical judgments to anything else; this is supposed to make it immune to “open question” worries about the account’s intensional adequacy.

One might worry that the extent to which the standpoint characterization succeeds in distinguishing constructivism from response-dependence views is also the extent to which the view starts to look like other existing metaethical alternatives: realism, expressivism, or a simple subjectivism. Suppose that what distinguishes constructivism from response-dependence views is that on the former view ethical judgments are about what follows from within a particular standpoint and not how a certain type of subject would respond under suitable conditions. If this is the case, it is clear how constructivism might count as a version of cognitivism. Ethical judgments are a species of belief, ones that report facts about what follows from within a particular ethical standpoint. But how are we to understand the nature of the “primitive” stances that make up these standpoints? So far, all that we have been told is that they are not about a suitably represented subject’s responses.

c. Ridge’s Argument by Elimination

Michael Ridge (2012) presents a serious challenge to recent efforts to find a plausible, novel, and free-standing version of constructivism in metaethics–one that is especially troublesome for defenders of the standpoint characterization. His argument is one that proceeds by elimination. Ridge exhaustively lists the various ways in which we might understand an agent’s primitive ethical stances and then argues that the forms of constructivism each generates fail to constitute a novel or free-standing alternative to familiar metaethical positions. Ridge counts five options. Due to restrictions of space, however, the following presentation takes a more schematic approach.

According to the standpoint view, ethical judgments are two-tiered. At the second tier, ethical judgments express beliefs about what follows from within a particular ethical standpoint. But how are we to understand the “primitive” first tier judgments or stances–the ones that make up a standpoint? What are they? Do they have their own representational content or are they non-representational states of mind? These questions are crucial for determining whether constructivism represents a free-standing view. Yet it is not clear that the standpoint constructivist can answer them in a way that succeeds in distinguishing her view from familiar alternatives.

Suppose that the first-tier stances are a species of belief. That is, they have representational content. What do these first-tier stances purportedly represent? There would only appear to be two options. Either they represent features of the world that are independent of these beliefs, or they represent one’s other first-tier beliefs. But neither of these options is a promising way of establishing constructivism as a free-standing metaethical position.

On the one hand, if the first-tier beliefs represent features of the world that obtain independently of our stances (that is, stance-independent ethical facts), the view just turns out to be a version of realism. For, in this case, there are at least some ethical judgments whose truth does not ultimately depend on the relations they bear to other stances within a particular ethical standpoint; rather, some will ultimately depend on the relation they bear to the world. On the other hand, if they represent one or another of the other first-tier beliefs that constitute the standpoint, the view may avoid realism. But, in this case, the worry is that this makes the view either vacuous or objectionably circular.

So far, we have been supposing that the first-tier stances are something like beliefs. But what if they have some kind of non-representational content? How are we then to understand the states of mind that these basic ethical stances embody?

Suppose that the first-tier stances embody some non-representational state of mind. Considering some of the motivations to which constructivists appeal, one might assume that these basic stances embody a form of pro-attitude, like desires. But if one is able to work out the constitutive relations amongst this class of non-representational pro-attitudes, one will have arguably succeeded in one of the projects central to expressivism.

One of the big challenges for expressivists, the so-called “Frege-Geach problem”, is to explain how ethical discourse exhibits standard logical inference patterns despite the fact that ethical language is not fact-stating, on their view. Standard logical inference is truth preserving. But expressivists do not think that ethical language is used to express truth-evaluable content. So expressivists must come up with an alternative “practical logic” which shows that it is nonetheless legitimate to use ethical language in these ways. Expressivists have offered different proposals, but they remain extremely controversial.

Constructivism is usually understood as a version of cognitivism. This is because it makes ethical judgment, at some level, a species of belief. In particular, the standpoint constructivist understands the second-tier ethical stances this way. They are beliefs about what follows from within the various practical standpoints. They have truth-evaluable content and, as such, should figure unproblematically in contexts which require such content (belief reports, truth-preserving inferences, and so forth). This might create the impression that constructivism will be immune to the kinds of objections that expressivists face. But if the stances that constitute these standpoints are themselves non-representational, it would appear that constructivism is saddled with the same project as expressivism, at least at the level of the first-tier ethical stances.

If certain non-representational stances are supposed to follow from within a particular standpoint constituted by other non-representational stances, we must suppose that the relations amongst these stances constitute a structure with its own non-truth-preserving “practical logic” (compare Gibbard 1997). It is only once this logic is worked out that a constructivist will be able to say whether a particular second-tier stance–in this case a belief–is true or not. If this is indeed the way to understand the standpoint constructivist’s apparatus, one might object that the view does no better, or perhaps even worse, than expressivism. In fact, an expressivist might argue that the constructivist ought to avoid the complication of an additional tier of judgments and simply abandon cognitivism altogether; instead, she should take ethical statements to directly express the non-representational states of mind that constitute the various ethical standpoints. This would in effect make the view a species of expressivism.

This is a line that became increasingly popular in the early 21st century. Several defenders of expressivism have argued that constructivism is best understood as a species of, or supplement to, their view–see Gibbard (1997), Lenman (2010), Ridge (2012). It has even been encouraged by some constructivists–including Korsgaard (2003) and, to a lesser extent, Street (2008, 2010).

But if constructivism is indeed best understood this way it arguably loses in importance as a challenger to realism. The expressivist challenge to realism is both familiar and fairly well understood. Moreover, a constructivist-expressivism would arguably present a weaker challenge than the existing “quasi-realist” versions advanced by expressivists like Blackburn. Not only would a defender of constructivist-expressivism be giving up on cognitivism, she would also be giving up on even the appearance that ethical discourse is stance-independent. Quasi-realist expressivists are at least concerned to accommodate realist ways of talking and thinking about objectivity in ethics. Recall that one of the motivations for constructivism is that the view is purportedly able to secure everything the realist wants without the problems that realism allegedly introduces. But someone who defends this kind of constructivism arguably fails to secure anything the realist wants. Although such a view might represent an interesting internal challenge for expressivists, it would not appear to present a novel or especially plausible challenge to realism. But the only apparent way around this objection risks making constructivism into a version of another well-known metaethical position.

Any version of constructivism that characterizes ethical standpoints in terms of non-representational stances will have a problem distinguishing itself as a free-standing view. In order to see why, let us suppose that the constructivist denies that an ethical standpoint involves a level of structure that would require a practical logic. This would position the view closer to a simple form of subjectivism–for example, a view that takes moral judgments to express beliefs about which acts would maximally satisfy an agent’s actual desires. In this case, one might also describe this form of subjectivism as providing a two-tiered account of ethical judgments. At the first tier there are desires, a kind of non-representational stance; at the second tier there are beliefs about these desires. Furthermore, one might argue that such a view, like standpoint constructivism, is distinct from the family of response-dependence theories.

According to simple subjectivism, ethical judgment is not about how certain subjects would respond; rather, it is about whether an action satisfies one’s actual desires, and how many or, alternatively, how strong these are. Importantly, however, the subjectivist does not take the second-tier judgments to be about what follows amongst an agent’s desires; rather, they express beliefs about how many desires on balance would be satisfied, or frustrated, by a particular course of action. Such a view arguably does not require any logic. Although an agent’s set of desires may exhibit some structure–for example, with some desires taking other desires as their objects, or some desires being more general and others specifications of them–it does not involve a level of sophistication that would support “entailment” claims. Consequently, the challenges associated with expressivism do not arise.

Simple subjectivism provides a model for how a constructivist might avoid the kind of difficulties associated with the expressivist’s project. Of course, the problem with this approach is that it requires that the constructivist explain how her view represents a novel and interesting advance on common versions of simple subjectivism. One might have thought that the extent to which constructivism represents an improvement on such theories is the extent to which the view incorporates structure at the level of first-tier ethical stances (compare Street 2008: 230-1).

There would appear to be a dilemma for a constructivist who insists that first-tier stances are non-representational. Either these stances combine to create a structure within which some stances may be said to follow from others, in which case the view involves the same difficulties that expressivism does; or a standpoint is to be understood as a mere collection of stances without any sophisticated structure, in which case the view starts to look like a version of simple subjectivism, with all the virtues and vices that such views carry.

The prospect of a free-standing metaethical constructivism is looking dim. The standard proceduralist characterizations give the appearance that constructivism is best understood as a version of a response-dependence theory. This might make the standpoint characterization appear more promising. But if an ethical standpoint is constituted by beliefs, constructivism either folds into realism or turns out to be vacuous. If it is constituted by non-representational stances, it is best interpreted as a species of, or supplement to, expressivism or a version of a simple subjectivism. Unless constructivism can be shown to represent an advance on one of these alternatives, the view would appear to lack motivation. Nothing that has been said thus far rules out this possibility. But even if constructivism represented an improvement on response-dependence theories or expressivism or simple subjectivism, it may still fail to represent any new or interesting challenge to realism. One would have to show that constructivism improves on these other views in ways that also make for a more formidable challenge to realism. Perhaps the more promising option would be to give up on the claim that constructivism represents a free-standing metaethical view and argue that the challenge presented by constructivism is of a different sort.

5. A Challenge to Traditional Metaethics

One such recent proposal locates what is novel and important about the constructivist challenge in the Kantian distinction between practical and theoretical reason. Defenders of this interpretation of constructivism in metaethics include Christine Korsgaard (2003), Stephen Engstrom (2013), and Carla Bagnoli (2013). They argue that traditional approaches in metaethics understand practical reasoning as a kind of “applied” theoretical reasoning but that this fails to appreciate the distinctively practical nature of deliberation and choice.

Traditional debates in metaethics typically recognize a number of platitudes about the nature of moral discourse and experience. Two of the most prominent of these include the common ideas (i) that there are correct answers to normative questions and that these correct answers are made correct by objective normative facts and (ii) that our judgments about the correct answers to normative questions are themselves sufficient to motivate us to act under normal circumstances (Smith 1994: 6-7).

However, traditional approaches in metaethics have struggled to adequately account for both of these features in a non-mysterious way. This, again, is the problem of “queerness” that Mackie identified and that we have discussed above in section 3b. As a result, we find some traditional views in metaethics that emphasize the objectivity of ethics but sacrifice or downplay the connection between normative judgment and motivation; others emphasize the motivational connection but sacrifice or downplay the objectivity of ethics.

The proposed constructivist response to this dilemma involves rejecting the underlying understanding of practical reasoning as a kind of “applied” theoretical reasoning, together with its “ontological” conception of objectivity in ethics. Theoretical reasoning attempts to acquire knowledge of facts that exist prior to and independent of its activity. It is the independent existence of these facts that makes theoretical reasoning objective. Bagnoli describes this as an ontological conception of objectivity because objectivity here depends on the existence of a special class of facts. But she and other constructivist critics argue that the same feature which constitutes the objectivity of theoretical reasoning also makes this kind of objectivity unsuitable for practical reasoning. For practical reasoning is ultimately concerned with deliberation and action; its verdicts carry a special authority that has the force, under normal conditions, to move us to act.

According to the constructivist critique, traditional approaches in metaethics conceive of practical reasoning as an attempt to gain theoretical knowledge of prior and independent normative facts and then apply this knowledge in deliberation. But it is precisely this theoretical conception of objectivity as “external” that would appear to be at odds with the “internal” authority and motivational power of practical judgments. It is difficult to see why an ontology of prior and independent normative facts should generate reasons that are authoritative and efficacious from within the practical standpoint of someone deciding what to do. As Korsgaard (1996, 2003) argues, one can always ask why one has reason to do what the prior and independent normative facts indicate that one should do. If this is the case, one might think that the underlying account of practical reason has failed to do its job.

In contrast to the ontological conception of objectivity, Korsgaard, Engstrom, and Bagnoli each argue in their own way for a “practical” conception of objectivity in ethics. They agree that a successful account of practical reason will be one that explains the normative authority and motivational force of practical judgments in terms of the special objectivity that is constitutive of the activity of practical reasoning itself. These constructivists follow Kant in characterizing ethical objectivity in terms of the constitutive conditions of rational agency and the specific form of practical reasoning that this involves. According to the kind of Kantian constructivism they advance, practical reasoning is an activity with its own internal constraints. In particular, it is an autonomous form of activity. This means that it is both an activity that must be objective and one whose objectivity cannot come from an external source (compare Kant 1998).

This is where the constructivist interpretation of autonomy as a form of self-legislation comes into play. On this view, one is subject only to those laws that one has legislated oneself. As laws, the results of self-legislation must take a special universal form that constitutes their objectivity. As self-legislated, these laws are not external to one’s own practical reasoning and hence are able to provide a source of practical authority. In deciding what to do, one is at once aware of oneself as both the author and the subject of the normative demands on which one acts.

Some claim that this conception of practical objectivity succeeds where the ontological conception fails: namely, that it is able to ground the normative authority and motivational efficacy of practical judgments. The idea is that in order to be a rational agent one must act on certain kinds of universal principles or reasons. Strictly speaking, failure to act on these principles or reasons does not mean that one acts badly; rather, it means that one fails to be a rational agent or act at all. Hence, insofar as one is to count as a rational agent or act at all, one must necessarily take these principles or reasons as decisive.

Does this account of practical objectivity also succeed in establishing constructivism as a genuinely novel and free-standing alternative to traditional approaches in metaethics? It is not clear. Although this form of Kantian constructivism does appear to establish an important distinction between constructivist and traditional understandings of practical reason, one might argue that these claims about the relation between objectivity and practical authority nonetheless belong to a first-order inquiry into the demands of practical reason (Hussain and Shah 2006). If this is the case, both realists and expressivists may plausibly argue that they are able to help themselves to the same resources that constructivists have developed. Moreover, by shifting the focus of metaethical debates away from traditional semantic and metaphysical concerns, one might object that these constructivists have merely changed the subject.

6. References and Further Reading

a. Classical Statements of Constructivism

Darwall, Stephen and Gibbard, Allan and Railton, Peter. 1992. “Toward a Fin de siecle Ethics: Some Trends.” Philosophical Review 101: 115-189.
- This article provides a history of developments in metaethics over the past hundred years and presents the state-of-the-art at the end of the Twentieth Century. It devotes brief, though influential, discussion to constructivism as a form of hypothetical proceduralism.
Kant, Immanuel. 1998. Groundwork of the Metaphysics of Morals. Trans. by Mary J. Gregor. New York: Cambridge University Press.
- Some defenders of constructivism find inspiration for their views in Kant’s discussion of autonomy as self-legislation.
Korsgaard, Christine. 1996. The Sources of Normativity. Cambridge: Cambridge UP.
- Korsgaard argues that constructivism (or “procedural realism”) is the only metaethical position that can adequately accommodate the normative force of practical reasons and morality. This work is considered, along with the work of Rawls, one of the most important early presentations of constructivism.
Milo, Ronald. 1995. “Contractarian Constructivism.” Journal of Philosophy 92: 181-204.
- Here, Milo coins the term “stance-dependence” and develops a metaethical interpretation of the constructivism one finds in Rawls’s work.
O’Neill, Onora. 1989. Constructions of Reason: Explorations of Kant’s Practical Philosophy. Cambridge: Cambridge University Press.
- O’Neill devotes a chapter of this work to responding to objections to Rawls’s view by sketching a more genuinely Kantian form of constructivism. She suggests that this view helps to illuminate the space that might exist between realist and relativist positions in metaethics.
Plato. 1997. Euthyphro. In Plato: Complete Works, John Cooper and D.S. Hutchinson (eds.), G.M.A. Grube (trans.), pp. 1-16.
- This short dialogue presents the famous Euthyphro Question.
Rawls, John. 1971/1999. A Theory of Justice. Cambridge, MA: Harvard UP.
- This is Rawls’s first full statement of justice as fairness; it includes detailed presentation of and arguments for the original position as the relevant choice procedure for determining principles of justice.
Rawls, John. 1980. “Kantian Constructivism in Moral Theory.” Journal of Philosophy 77: 515-572.
- This work is generally considered the locus classicus for contemporary discussions of constructivism in metaethics.
Rawls, John. 1993/1996. Political Liberalism. New York: Columbia UP.
- Here, Rawls presents his mature interpretation of justice as fairness as a form of political constructivism and contrasts this with his earlier Kantian interpretation.
Scanlon, T.M. 1998. What We Owe to Each Other. Cambridge, MA: Harvard University Press.
- Scanlon defends a form of local constructivism (“contractualism”) about morality.

b. Later Statements of Constructivism

Bagnoli, Carla. 2013a. “Constructivism about Practical Knowledge.” In Bagnoli, Carla (ed.): 153-182.
- Bagnoli rejects an ontological conception of objectivity for practical reason and develops an alternative “practical” conception of objectivity grounded in knowledge of oneself as an agent.
Engstrom, Stephen. 2013. “Constructivism and Practical Knowledge.” In Bagnoli, Carla (ed.): 133-152.
- Engstrom locates Kantian constructivism in an ancient tradition of theories of practical reason and contrasts this with two dominant modern approaches–rationalism and empiricism.
Galvin, Richard. 2010. “Rounding Up the Usual Suspects: Varieties of Kantian Constructivism in Ethics.” Philosophical Quarterly 61: 16-36.
- Galvin provides a taxonomy for categorizing different forms of Kantian constructivism and devotes special discussion to the objections that metaethical forms of the view face.
James, Aaron. 2007. “Constructivism about Practical Reasons.” Philosophy and Phenomenological Research 74: 302-25.
- James defends an explanatory yet non-procedural characterization of constructivism about practical reasons.
Korsgaard, Christine. 2003. “Realism and Constructivism in Twentieth-Century Moral Philosophy.” Journal of Philosophical Research APA Centennial Supplement: 99-122.
- Korsgaard argues that practical reasoning is not to be understood as a kind of “applied” theoretical reasoning. On her account, ethical concepts do not refer to facts that we come to know but, rather, to practical problems that agents must solve.
LeBar, Mark. 2008. “Aristotelian Constructivism.” Social Philosophy and Policy 25: 182-213.
- LeBar argues that an Aristotelian form of constructivism that grounds the truth of ethical judgments in our further judgments about what it is to live well avoids some of the standard objections to Kantian forms of the view and is generally a better framework for defending a thoroughgoing metaethical constructivism.
Lenman, James. 2010. “Humean Constructivism in Moral Theory.” Oxford Studies in Metaethics 5: 175-193.
- Lenman sketches a Humean form of constructivism that he argues may serve as a suitably naturalistic complement to metaethical expressivism.
O’Neill, Onora. 2003. “Constructivism vs. Contractualism.” Ratio (new series) XVI 4 December, pp. 319-331.
- This article compares and contrasts Rawls’s constructivism with Scanlon’s contractualism and concludes that, once their underlying positions are more fully understood, it may make better sense to view Rawls as a contractualist and Scanlon as a constructivist than vice versa.
Silk, Alex. 2014. “Nietzschean Constructivism: Ethics and Metaethics for All and None.” Inquiry, pp. 1-37.
- Silk defends a Nietzschean form of constructivism that he thinks can both explain away an apparent tension in Nietzsche’s own writings and serve as a contender in contemporary metaethics.
Street, Sharon. 2006. “A Darwinian Dilemma for Realist Theories of Value.” Philosophical Studies 127, no. 1: 109-166.
- Here, Street argues that realism is incompatible with our best evolutionary explanations of how we came to have the evaluative attitudes we do.
Street, Sharon. 2008. “Constructivism about Reasons.” Oxford Studies in Metaethics 3: 207-45.
- Street sketches the form that a “thoroughgoing”, metaethical constructivism must take against the backdrop of more familiar “restricted” constructivist views, like those presented in the works of Rawls and Scanlon.
Street, Sharon. 2009. “In Defense of Future Tuesday Indifference: Ideally Coherent Eccentrics and the Contingency of What Matters.” Philosophical Issues (a supplement to Nous) vol. 19, ed. Ernest Sosa, pp. 273-298.
- Street works through a series of case studies of characters from recent moral philosophy (including an ideally coherent Caligula) that are supposed to present a challenge for a constructivist understanding of ethical objectivity. She argues that careful consideration of these cases shows them to be far less counter-intuitive than is often alleged by constructivists’ opponents.
Street, Sharon. 2010. “What is Constructivism in Ethics and Metaethics?” Philosophy Compass 5/5, pp. 363-384.
- Street presents a taxonomy of constructivist positions and argues that a standpoint characterization of constructivism offers a free-standing alternative to familiar metaethical positions: including realism, response-dependence theories, and expressivism.

c. Critics of Constructivism

Brink, David. 1989. Moral Realism and the Foundations of Ethics. Cambridge: Cambridge University Press.
- In an appendix to this work, Brink presents a careful exegesis of Rawls’s Kantian constructivism in the Dewey Lectures and argues that, contrary to what Rawls appears to argue, a coherence theory of justification in ethics does not commit one to anti-realism in metaethics.
Copp, David. 2013. “Is Constructivism an Alternative to Moral Realism?” In Bagnoli, Carla (ed.), pp. 108-132.
- Copp argues that the distinction between constructivism and realism is philosophically uninteresting and threatens to distract theorists from more pressing issues, like the nature of normativity and the relation between truth and cognitivism.
Enoch, David. 2009. “Can there be a Global, Interesting, Coherent Constructivism about Practical Reason?” Philosophical Explorations vol. 12, no. 3, pp. 319-339.
- Enoch articulates what it would mean for there to be a global constructivist position and argues that such a view, though not strictly inconsistent, threatens to make practical deliberation impossible.
Fitzpatrick, William. 2005. “The Practical Turn in Ethical Theory: Korsgaard’s Constructivism, Realism and the Nature of Normativity.” Ethics 115, pp. 651-691.
- Fitzpatrick reveals crucial ambiguities in Korsgaard’s argument for the claim that the normative force of practical principles can only be secured by constructivism–and, thus, requires one to reject realism. He argues that the most plausible ways of remedying these deficiencies in her argument turn out to be compatible with realism.
Gibbard, Allan. 1999. “Morality as Consistency in Living: Korsgaard’s Kantian Lectures.” Ethics 110, pp. 140-164.
- Gibbard objects that Korsgaard’s constructivism cannot secure substantive universal moral demands from merely formal requirements of consistency.
Hussain, Nadeem and Nishi Shah. 2006. “Misunderstanding Metaethics: Korsgaard’s Rejection of Realism.” In Oxford Studies in Metaethics vol. 1, R. Shafer-Landau (ed.), Oxford: Oxford University Press, pp. 265-294.
- Hussain and Shah argue that Korsgaard’s constructivism is best understood as a first-order ethical theory about the relationship between morality and practical reason and not as a free-standing alternative to realism or other familiar metaethical positions.
Ridge, Michael. 2012. “Kantian Constructivism: Something Old, Something New.” In Lenman, James and Shemmer, Yonatan (eds.): 138-158.
- Ridge argues by elimination that constructivism is not a free-standing alternative in metaethics.
Shafer-Landau, Russ. 2003. Moral Realism: a Defense. Oxford: Oxford University Press.
- In this work, Shafer-Landau devotes a chapter to presenting and responding to constructivist arguments against realism.
Stern, Robert. 2012. “Constructivism and the Argument from Autonomy.” In Lenman, James and Shemmer, Yonatan (eds.): 119-137.
- Stern reconstructs and evaluates three distinct constructivist arguments against realism that take as their central premise the idea that realism is incompatible with agential autonomy.
Timmons, Mark. 2003. “The Limits of Moral Constructivism.” Ratio 16: 391-423.
- Timmons teases out the contours of a kind of contractualist constructivism that he finds in the work of Scanlon and argues that such a view is vulnerable to an objectionable form of relativism.

d. Collections of Essays

Bagnoli, Carla (ed.) 2013b. Constructivism in Ethics. Cambridge: Cambridge University Press.
Lenman, James and Shemmer, Yonatan (eds.) 2012. Constructivism in Practical Philosophy. Oxford: Oxford University Press.

e. Other Related Work in Metaethics

Blackburn, Simon. 1984. Spreading the Word: Groundings in the Philosophy of Language. Oxford: Clarendon Press.
- This is a comprehensive introduction to the philosophy of language in which, amongst other things, Blackburn defends a quasi-realist expressivism for evaluative language.
Blackburn, Simon. 1993. “Circles, Finks, Smells, and Biconditionals.” Philosophical Perspectives 7, pp. 259-279.
- Blackburn presents one standard objection to response-dependence theories.
Fitzpatrick, William. 2008. “Robust Ethical Realism, Non-Naturalism and Normativity.” In Oxford Studies in Metaethics, vol. 3, R. Shafer-Landau (ed.), Oxford: Oxford University Press, pp. 159-205.
- Fitzpatrick presents and defends a robustly realist and non-naturalist view in metaethics.
Johnston, Mark. 1989. “Dispositional theories of value.” Proceedings of the Aristotelian Society suppl. vol. 62, pp. 139-174.
- This is an example of a response-dependence account of evaluative concepts.
Mackie, J.L. 1977. Ethics: Inventing Right and Wrong. London: Penguin Books.
- The first chapter of this book is where Mackie introduces his famous Argument from Queerness and advances an error theory in metaethics.
Railton, Peter. 1996. “Moral Realism: Prospects and Problems.” In Moral Knowledge?: New Readings in Moral Epistemology. Oxford: Oxford University Press, pp. 49-81.
- Railton presents a general taxonomy for realism across different discourses and then asks whether there is a form of moral realism that is able to accommodate standard features of our moral experience.
Sayre-McCord, Geoffrey. 1988. “Introduction: The Many Moral Realisms.” In Essays on Moral Realism, Ithaca: Cornell University Press, pp. 1-23.
- This article presents another standard way of taxonomizing metaethical positions.
Smith, Michael. 1994. The Moral Problem. Oxford: Blackwell.
- Smith presents a set of platitudes about morality which together generate a puzzle for philosophers interested in metaethics.
Wallace, R. Jay. 2008. “Practical Reason.” The Stanford Encyclopedia of Philosophy, Edward N. Zalta (ed.)
- This article discusses the various ways of understanding the standpoint of practical reason.

Author Information

Nathaniel Jezzi
Email: n.jezzi@abdn.ac.uk
University of Aberdeen
United Kingdom

Epistemology of Memory

We learn a lot. Friends tell us about their lives. Books tell us about the past. We see the world. We reason and we reflect on our mental lives. As a result we come to know and to form justified beliefs about a range of topics. We also seem to keep these beliefs. How? The natural answer is: by memory. It is not too hard to understand that memory allows us to retain information. It is harder to understand exactly how memory allows us to retain knowledge and reasons for our beliefs. Learning is largely a matter of acquiring reasons for changing views. But how do we keep reasons for the views we keep? The epistemology of memory concerns memory’s role in our having knowledge and justification. This branch of epistemology, unlike nearly all other branches, addresses our having knowledge and justification over time.

This article reviews the major epistemic roles that philosophers have assigned to memory. Section 1 surveys the nature of memory and the various memory systems. Some philosophers think the relation knowledge bears to at least one memory system is maximally strong: remembering just is a way of knowing. Section 2 covers this strong relation. Section 3 canvasses the main problems that data on human memory pose to theories of justification and the central attempts to solve these problems. Section 4 discusses the historical and contemporary responses to two main skeptical challenges about memory.

Table of Contents

1. The Nature of Memory
2. Memory and Knowledge
   a. The Epistemic Theory of Memory
3. Memory and Justification
   a. Problems
   b. Responses
4. Memory and Skepticism
   a. Memory and Accuracy
   b. Memory and the Past
5. References and Further Reading

1. The Nature of Memory

Traditionally, philosophers have likened memory to a storehouse or a recording device. In the Theaetetus, Plato claims that the mind is analogous to a wax tablet. To perceive is to make an impression on the tablet, leaving behind an exact image or representation of what was perceived. Memory keeps the images and forgetting is a matter of losing them. In his Confessions, Augustine says perception deposits images of objects into the storehouse of memory and the process of recalling is the process of retrieving these deposits. Locke and Hume tell much the same story, as do many other philosophers up through the 20^th century.

On this storehouse view, memory stockpiles experiences and beliefs. Stored items may eventually degrade or become hard to access, but otherwise do not change (see Audi (1994: 420-1), Burge (1997: 321) and McGrath (2007: 13)). This view is commonsensical. It explains how it is that we are able to represent the past accurately in our thoughts and recollective experiences. It also explains why each of us, over time, tends to believe the same thing occurrently more than once. Yesterday, Maria believed that she went to high school in Santa Fe and she believes that today too.

During the 20^th century psychologists generally abandoned the storehouse view (see, for example, Bartlett (1932) and Schacter (1996, 2002)), though they still think that memory stores information. They believe human memory processing is much more complicated than the mere depositing of items and later withdrawing them. Memory selectively stores information, expands part of it, combines it with background information and adds data from the context in which the subject later retrieves the information. In other words, memory generally alters significantly what enters it. As a result, recollecting is not the retrieving, but rather the generating of representations of the past. Recollecting actually generates new beliefs about the past. Empirically minded philosophers of memory also have generally abandoned the storehouse view in favor of this generative view (see, for example, Debus (2010) and Michaelian (2011a, 2011b)), but epistemologists have been slower to shift models. Since this article covers the epistemological discussion of memory up to the beginning of the 21^st century, the storehouse view will generally be implicit.

Setting aside how exactly memory works, it will aid our epistemological discussion to get clearer on what memory is of or for. At least as far back as Henri Bergson (1896/1994) and Bertrand Russell (1921/1995), philosophers have recognized that there are different kinds of memory, or different memory systems, and 20^th century psychological research has confirmed the philosophers’ distinctions. Talk of ‘memory’ simpliciter, as if there were a single, uniform faculty, can obscure these distinctions. Distinct memory systems allow us to do different things and consist of different networks of rule-governed psychological processes. Two memory systems that are important to distinguish are declarative memory and procedural memory. Declarative memory is memory of information and events. Procedural memory is memory for skills and of how to perform actions. Different parts of the brain house, on the one hand, our data about bicycle riding and our riding experiences and, on the other hand, our acquired talent for riding. This helps explain the familiar phenomenon of finding it easy to do something, yet hard to state instructions for doing it (think of swimming, playing a flute, or tying a shoe), or vice versa.

Declarative memory divides into semantic (or propositional) memory and episodic (or experiential) memory. Semantic memory is memory for propositions and episodic memory is memory for events one has experienced. To see this distinction, consider how these types of memory can come apart. You remember that Plato taught Aristotle, but you do not remember Plato teaching Aristotle. How could you remember it? You were not there and you did not witness it. I can remember that I was born in a hospital, but (mercifully) I cannot remember being born in a hospital.

Semantic memory underlies memories with propositional content; semantic memory claims are often of the form “S remembers that p”. Episodic memory underlies memories with a kind of non-propositional content; episodic memory claims are often of the form “S remembers x”. Semantic memory is by far the most discussed memory system in epistemology. This is understandable, since epistemology centers on states that have propositional content. Epistemologists primarily discuss what it is for S to know that p, or what it is for S to have justification for believing that p, or the like. They focus less on non-propositional knowledge and justification. The epistemology of memory, as a result, has chiefly been the epistemology of semantic memory.

But it is worth noting that neglecting to consider other memory systems can render our epistemological theories vulnerable. Some philosophers have objected to certain theories of propositional knowledge on the grounds that they do not accommodate the role that episodic memory plays in our believing (see, for example, Shanton (2011)). And deeper reflection on procedural memory may advance other debates in epistemology, such as debates concerning knowledge-how. Knowledge-how is a practical knowledge, what you have when you know how to swim or how to tie a shoe. There is debate about whether knowledge-how is reducible to knowledge-that. That is, there is debate about whether practical knowledge can be fully understood in terms of knowing various propositions. But procedural memory seems to ground our knowledge-how and it differs importantly from declarative memory (see Michaelian (2011a)). In fact, psychological research suggests that sophisticated procedural memory can be retained even when semantic memory is crippled (one artist entirely lost his knowledge of language due to brain-damage, having to relearn his native tongue altogether and yet he remembered how to paint! See Schacter (1996: 140-2)). Investigating procedural memory may help reveal that knowledge-how is not reducible to knowledge-that.

2. Memory and Knowledge

Most of the interesting features of memory’s relationship with knowledge originate in memory’s relationship with justification. Knowledge requires justification. As a result, when justification connects in interesting ways with a topic, knowledge shares those connections. This section covers what is perhaps the only unique connection between memory and knowledge.

a. The Epistemic Theory of Memory

Semantic memory is responsible for our remembering that something is true. Much philosophizing in the 20^th century tried to state necessary and sufficient conditions for propositions of the form S remembers that p. The theory that dominated that discussion is especially important in epistemology: the epistemic theory of memory (see, for example, Anscombe (1981), Ayer (1956), Audi (2002), Locke (1971), Malcolm (1963), Moon (2013), Owens (2000), Pappas (1980) and Williamson (2000)). Roughly put, the epistemic theory states that remembering is a kind of knowing. If S remembers that p, then S knows that p. Many philosophers go even further: if S remembers that p, then S knows that p because S previously knew that p. You remember that Plato taught Aristotle, and this is because in the past you came to know that Plato taught Aristotle, and because that past knowledge has contributed to your present knowledge. (Incidentally, Plato might even agree; he appears to endorse the epistemic theory of memory in the Theaetetus.)

If the epistemic theory of memory is correct, we might not remember as much as we think we do. Remembering requires knowing and the standards for knowing are not low. In particular, it is generally accepted among philosophers that S knows that p just in case p is true, S believes that p, believing that p is justified for S and it is not accidental that S’s justification for p gives S a true belief that p. Knowledge is a kind of justified true belief, a kind where the truth of the belief is tightly connected to its justification. When the connection is not tight, the belief might be “Gettiered” or true by sheer accident. If you see someone walking down the street dressed as a postal worker, you might justifiedly believe that your mail will be delivered soon. Suppose the person you see is not in fact a postal worker, but is merely testing out a Halloween costume. And suppose that, nonetheless, the mail will indeed be delivered soon; your regular postal worker is just around the corner, delivering mail to your neighbor. Your belief that the mail will be delivered soon is justified, but true only by coincidence. So you do not know that the mail will be delivered soon.

If remembering requires knowing, then remembering requires everything required for knowing. If any requirement is not met, one does not remember, but at best merely seems to remember. In other words, if you seem to remember that the keys are on the dresser, but they in fact are not there, or you have no reason to believe that they are there, or you simply deny that they are there, then you do not remember that they are there.
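The reasoning of the last two paragraphs can be laid out schematically. What follows is only an illustrative sketch: the symbols are labels introduced here for convenience and are not notation used by the authors cited.

```latex
% Illustrative notation (an assumption of this sketch, not from the sources):
%   R_S p : S remembers that p        K_S p : S knows that p
%   B_S p : S believes that p         J_S p : believing that p is justified for S
%   N_S p : S's justification non-accidentally yields a true belief that p
\begin{align*}
\text{(Epistemic theory)}\quad
  & R_S\,p \rightarrow K_S\,p \\
\text{(Analysis of knowledge)}\quad
  & K_S\,p \leftrightarrow \bigl(p \land B_S\,p \land J_S\,p \land N_S\,p\bigr) \\
\text{(Consequence)}\quad
  & \bigl(\lnot p \lor \lnot B_S\,p \lor \lnot J_S\,p \lor \lnot N_S\,p\bigr)
    \rightarrow \lnot R_S\,p
\end{align*}
```

In the keys example, the three ways of failing to remember correspond to the first three disjuncts: the keys are not on the dresser (not-p), you have no reason to believe they are there (no justification), or you deny that they are there (no belief).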

Why endorse the epistemic theory of memory? A main reason is that it fits our ordinary uses of “remembers” and “knows” (see Moon (2013)). Consider the following conjunctive claim: Sally remembers that she has visited Rhode Island, but she does not know that she has. This conjunction sounds odd and one plausible explanation of the oddness is that remembering requires knowing. The second conjunct denies something the first conjunct asserts, so the conjunction seems incoherent.

Here is a closely related reason for endorsing the epistemic theory. Remembering requires knowing just in case all of the following are true: remembering requires believing, remembering requires justification and remembering requires non-accidental truth. And we can argue, one at a time, that remembering does indeed have these requirements. For example, the best explanation of the oddness of certain conjunctive claims is that remembering requires believing. Consider: Peter remembers that he owes Paul a dollar, but he does not believe that he owes Paul a dollar. At least at first glance, it is hard to make sense of this. How could Peter remember that without believing it?

Andrew Moon (2013) proposes another reason for supposing that remembering requires believing. He claims that if S remembers that p, then S can use p as a premise in certain justifying inferences. But, Moon adds, a premise is usable in justifying inference only if believed. If you do not believe that all tigers are mammals and that all mammals are animals, you cannot use these propositions as premises for reasonably inferring that all tigers are animals. So, remembering requires believing. Similarly, Moon claims that remembering requires justifiedly believing. This is because a premise is usable in justifying inference only if justifiedly believed. And inferences based on remembered propositions are justifying. So, remembering requires justified belief.

However, Moon’s argument faces worries. Suppose S remembers that p, but also remembers that all experts deny that p. Can S use p as a premise in any justifying inferences? Perhaps not. If S cannot, then not all we remember is usable as a justifying premise and Moon has not shown remembering requires believing. Or, suppose S justifiedly does not believe p. Couldn’t S nonetheless have reason to believe that if she uses p (rather than not-p) in her inferences, she will be more likely to arrive at the truth (if, say, p is a scientific theory that is likely ‘false but approximately true’)? If so, S might be able to use p as a premise in justifying inference, without believing p. Even if remembering that p allows justified inference from p, justified inference from p would not guarantee belief that p. It would not follow that remembering requires believing.

While the epistemic theory may make sense of certain conjunctive claims, it faces many objections. As noted above, if remembering requires knowing, then remembering requires everything required for knowing: belief, justification and non-accidental truth. Arguments against the epistemic theory have tried to show that remembering is possible even when at least one of these three requirements for knowledge is not met.

Martin and Deutscher (1966) give a well-known example in which there (allegedly) is remembering without believing. A painter paints a detailed farmyard scene. He believes he merely imagined the scene. However, it turns out that the painting captures an actual farmyard scene that the painter saw as a child. Unwittingly, the painter simply reproduced that scene. Martin and Deutscher (1966) add that the painter “did his work by no mere accident,” suggesting that the painter’s childhood experience caused him to bring to mind the scene (even though he believes that he merely imagined the scene). They conclude that this is a case of remembering without belief. Since knowing requires believing, this would be a case of remembering without knowing.

Martin and Deutscher’s conclusion may in a sense be right, yet their example may also not pose any problem for the epistemic theory. We can agree that the painter does not believe that the scene occurred. But exactly what is it the painter is remembering? It is plausible that, if he is indeed remembering something, he is remembering the scene or his visual experience of it. It is less plausible that he is remembering that the scene occurred or remembering the scene as having occurred. In other words, Martin and Deutscher may have given a case of remembering without believing, but the remembering is not semantic. It is episodic or some other sort of memory. If that is correct, then the example is no threat to the epistemic theory of memory, since that theory concerns only semantic memory.

Audi (1995) and Bernecker (2010: 75-7) appear to offer cases of remembering without the sort of justification that knowledge requires. Knowledge requires fairly strong justification and this justification must not be defeated. If Billy knows that there is a cookie on the table, then Billy has strong reason to believe that it is on the table. Even if he has some reason to doubt that there is a cookie on the table (he may have reason to suspect that his sister shaped some clay to look like a cookie), these doubts do not defeat his justification when he knows that there is a cookie on the table.

Audi and Bernecker offer the following kind of case. Suppose you remember that Plato taught Aristotle. However, your friends go on to play a prank on you and give you convincing reasons to think Plato never taught Aristotle–Plato never existed and Aristotle had no teacher. You retain your belief, but the prank defeats your justification. Your justification is no longer strong enough for you to know that Plato taught Aristotle. Nonetheless, Audi and Bernecker would think, you remember that Plato taught Aristotle. So, they conclude, remembering does not require justification.

But why suppose that, after the prank, you still remember that Plato taught Aristotle? The answer is unclear. Is it because you still have a true belief, which you acquired in the past, even though you lack overall reason for keeping it? Why would that be sufficient for remembering? Unless an explanation is offered, we may not have reason to count the case as a counterexample to the epistemic theory of memory.

Bernecker (2010) describes a case in which there appears to be remembering without non-accidental truth–that is, the remembered proposition is true by mere accident: you justifiably but incorrectly believe that your friend has borrowed a certain book from the library. Later, your friend indeed checks out that very book. As a result, your belief is true, but by coincidence alone. Bernecker thinks you still count as remembering that your friend has borrowed the book from the library. If this is a case of remembering an accidentally true proposition, it is a case of remembering without knowing. But is the antecedent here true? Some philosophers (for instance, Moon (2013)) see no reason to suppose that it is. If Bernecker can persuade us that it is in fact true, he will have provided a genuine counterexample to the epistemic theory of memory.

We have seen several attempts to show that remembering does not require knowing. Each attempt faces a similar problem: when knowledge is absent, it is unclear whether semantic remembering is present. Support for the claim that semantic remembering is indeed present has typically involved an appeal to intuitions that some critics apparently lack. But there may be a less controversial way of showing that remembering does not entail knowing. If epistemologists discard the storehouse view of memory and adopt the generative view, they may discover clearer kinds of cases where propositions are remembered, yet not known, or at least not known in the past by the subject.

For debates about the epistemic theory of memory, it matters significantly whether remembering entails knowing. And it matters significantly for another debate in epistemology. Timothy Williamson (2000) has influentially argued that the concept of knowledge is fundamental in our thinking. Having the concept of knowledge crucially allows us to understand quite a bit of psychology and epistemology and we cannot fully explain knowledge in terms of other psychological or epistemological conditions and relations.

In support of this, Williamson (2000: 34) claims that “knowing is the most general factive stative attitude.” He means roughly that, if the state of having a certain kind of attitude toward p (like hearing that p or seeing that p) guarantees that p is true, then being in that state guarantees that p is known. Knowing is the most general factive stative attitude, in that there is no way that S could be in the state of having a truth-guaranteeing attitude toward p without also knowing that p. Now, many philosophers think that remembering that p guarantees that p is true, even if remembering that p does not guarantee belief that p, strong overall justification for believing that p or the non-accidental truth of p. If they are right and remembering does not require knowing, then Williamson’s claim is incorrect. Remembering is factive, but is not knowledge, so knowledge is not the most general factive stative attitude. As a result, his argument would weaken; it is less clear that the concept of knowledge is fundamental to our thinking.
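The structure of this objection can be sketched as follows. The notation is introduced here purely for illustration and is not Williamson’s own; in particular, the schematic letter for an arbitrary factive stative attitude is an assumption of the sketch.

```latex
% Illustrative notation (assumed for this sketch):
%   \Phi_S p : S bears factive stative attitude \Phi to p (seeing that p, hearing that p, ...)
%   K_S p    : S knows that p         R_S p : S remembers that p
\begin{align*}
\text{(Williamson's claim)}\quad
  & \text{for every factive stative attitude } \Phi:\;
    \Phi_S\,p \rightarrow K_S\,p \\
\text{(Factivity of remembering)}\quad
  & R_S\,p \rightarrow p \\
\text{(Objector's premise)}\quad
  & \text{possibly } \bigl(R_S\,p \land \lnot K_S\,p\bigr) \\
\text{(Conclusion)}\quad
  & \text{the claim fails for } \Phi = R
\end{align*}
```

Laid out this way, it is clear that the objection stands or falls with the objector’s premise, which is exactly what defenders of the epistemic theory of memory deny.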

A closely related claim of Williamson’s may also be challenged, if remembering does not require knowing. Williamson says that all and only evidence is knowledge. More precisely, he says that S knows that p just in case p is included in S’s total evidence. It is plausible that if S remembers that p, then S’s total evidence includes p. If this is right and if remembering does not require knowing, then not all evidence is knowledge. Some of what we remember is evidence, yet not known.
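This second challenge has a similar shape. As a rough sketch, again using labels introduced here only for illustration:

```latex
% Illustrative notation (assumed for this sketch):
%   E_S   : S's total evidence (a set of propositions)
%   K_S p : S knows that p        R_S p : S remembers that p
\begin{align*}
\text{(Evidence is knowledge)}\quad
  & K_S\,p \leftrightarrow p \in E_S \\
\text{(Remembering supplies evidence)}\quad
  & R_S\,p \rightarrow p \in E_S \\
\text{(Objector's premise)}\quad
  & \text{possibly } \bigl(R_S\,p \land \lnot K_S\,p\bigr) \\
\text{(Conclusion)}\quad
  & \text{possibly } \bigl(p \in E_S \land \lnot K_S\,p\bigr)
\end{align*}
```

The conclusion contradicts the right-to-left direction of the first line, so anyone who accepts both the second and the third line must give up the claim that all evidence is knowledge.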

3. Memory and Justification

For most debates in the epistemology of memory it does not matter whether remembering entails knowing. This is because most debates ultimately concern the connections between memory and epistemic justification. So, even if remembering does not entail knowing, there remains much to discuss. One neutral way of proceeding is to think about cases of apparent remembering: cases in which a subject has a memory experience that p, or recollects that p, or recalls p as known or as true and so on. Even if the subject is not in fact remembering that p, memory may still justify the subject in believing that p. But how? And in exactly what circumstances?

a. Problems

In debates about epistemic justification, philosophers have construed memory mainly as a source of challenges. A main way to test a theory of justification is to see if it has the right implication in cases where memory plays some special role. Philosophers apply this test most frequently in the debate about internalism and externalism in epistemology.

It is controversial what these views even are, but here is a rough characterization. At a minimum, internalism states that mentally alike individuals are completely alike in their justification (see Conee and Feldman (2001)). Environmental differences by themselves make no difference to justification. So if, for example, you are justified in believing that there are boxes in the basement, that justification would remain even if your neighbor stole all the boxes from the basement. In order for your justification to change, your mental life would have to change–you would need to have a visual experience of an empty basement, or to seem to hear your spouse report that the basement is bare and so forth. You, and someone mentally just like you, are both justified in believing that there are boxes in the basement, even if only one of you has boxes, even if only one of you lives in a world with basements.

Externalism is the denial of internalism. It states that environmental differences can result in differences in justification, even if they do not result in mental differences. What is actually downstairs may matter. Or it may matter what is downstairs in nearby possible worlds. It may matter whether the particular way in which you would form or keep the belief that there are boxes in the basement tends to get at the truth.
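The contrast between the two views can be put schematically as a supervenience claim and its denial. This is only a rough sketch: the function M and the justification predicate are labels introduced here, and treating justification as all-or-nothing is a simplification.

```latex
% Illustrative notation (assumed for this sketch; requires amsmath and amssymb):
%   M(x)  : the total mental state of subject x
%   J_x p : x is justified in believing that p
\begin{align*}
\text{(Internalism)}\quad
  & \Box\,\forall x\,\forall y\,
    \bigl[\,M(x) = M(y) \rightarrow \forall p\,(J_x\,p \leftrightarrow J_y\,p)\,\bigr] \\
\text{(Externalism)}\quad
  & \Diamond\,\exists x\,\exists y\,\exists p\,
    \bigl[\,M(x) = M(y) \land J_x\,p \land \lnot J_y\,p\,\bigr]
\end{align*}
```

On the first line, the stolen boxes make no difference to your justification unless they make a difference to your mental life. On the second, two mentally identical subjects could differ in justification because of what is actually in their basements or how reliably their beliefs track the truth.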

Any theory of justification appears to face some challenge from facts about human memory. Externalists have argued that their view can overcome these challenges better than internalism can (see, for example, Bernecker (2008, 2010), Goldman (1999, 2009, 2011), Greco (2005) and Senor (1993, 2010)). A fine way to test a theory of justification is to check its implications about particular cases. A complete theory of justification will have implications about every particular case. The implications of a good theory of justification will also match our intuitive judgments about each case. The implications of a bad theory will not. That is, a good theory will typically imply that ordinary people in ordinary circumstances are justified in believing what clearly trustworthy people tell them, in believing what their senses tell them about the world, in believing what seems to them to be the best explanation of what they have to go on and so on. A bad theory will not have all these implications and will imply that in some of these circumstances believing what is commonsensical is unjustified.

The circumstances of concern in this article all involve memory. The next sections cover particular kinds of circumstances that help test the implications of theories of justification. Think of each kind of circumstance as introducing a problem for these theories. If externalists are correct, and their view indeed has an easier time accommodating our intuitions and thereby solving these problems, then internalism is in bad shape. If, however, internalism can solve these problems easily enough, then it is much better off than many externalists suppose.

After introducing the problems we will consider the main responses to them. Of course, these are neither the only problems memory poses to theories of justification, nor the only responses. They are just the ones that have received the most attention.

i. The Problem of Forgotten Evidence

We are forgetful. We forget email passwords, where we put the car keys, anniversaries, acquaintances’ names and more. In some cases this gets us into trouble and in other cases it is harmless. Interestingly, when we do not forget and we keep beliefs about these things, we often nonetheless forget our original evidence for our beliefs. I cannot recall how I learned that Fred’s name is “Fred”–did a trustworthy friend tell me? Did Fred himself tell me? And you know that your email password is “iluvphilosophy,” but you cannot remember choosing it all those years ago. That password just seems familiar and using it works.

Forgetting is an epistemologically significant phenomenon. Here is one reason for that. In many cases, it seems that when we keep a belief, yet forget our original evidence for it, the belief remains justified. But this appears to conflict with certain theories of justification. In particular, it apparently conflicts with evidentialism, the view that the justified attitude for a subject toward a proposition is the attitude that fits the subject’s evidence (see Conee and Feldman (2008, 2011), Feldman and Conee (1985) and McCain (2014)). Understood broadly, your evidence is what you have to go on–your experiences, thoughts, feelings, background information and so forth. Evidentialism implies that if you are justified in believing that Fred’s name is “Fred”, then believing that fits what you have to go on. If you lose crucial evidence, however, believing that Fred’s name is “Fred” may no longer fit your evidence.

The Problem of Forgotten Evidence is the problem of accommodating our intuitions about justification in cases where key supporting evidence has been forgotten. There seem to be a lot of cases of this sort; we regularly forget our original evidence, while retaining the belief. Gilbert Harman (1986) is typically credited with developing this problem, though he never called it the “Problem of Forgotten Evidence”.

Which theories face the Problem of Forgotten Evidence? As mentioned above, evidentialism faces it. Traditionally, evidentialism has been understood to be a form of internalism. As a result, philosophers have understood the Problem of Forgotten Evidence to be a problem only for internalist theories of justification (for instance, Bernecker (2008, 2010)). But there are some evidentialist forms of externalism (see Comesaña (2010) and Goldman (2011)). These theories do not quite understand evidence to be all that you have to go on. Rather, evidence is understood more narrowly: it is just the stuff you have to go on, such that beliefs formed on its basis tend to be true (where contingent environmental factors partly determine what tends to be true).

So, some forms of externalism face the problem; evidence, even on the narrower understanding, can be forgotten. The problem also challenges any theory of justification that states that S’s having evidence for p is necessary for S’s being justified in believing that p. Some non-evidentialist externalist theories state roughly this necessary condition (see Alston (1988)). And finally, while the Problem of Forgotten Evidence is stated in terms of forgetting evidence, there is a more general problem here: how do we accommodate our intuitions about justification in cases where whatever it is that originally conferred justification (be it evidence or something else) is forgotten? It could be that most theories of justification face this more general problem, which is discussed prior to Harman (1986) by George Pappas (1980).

Any theory that faces, but cannot solve, the Problem of Forgotten Evidence is doubtful. It is important to consider, then, possible solutions to the problem and to consider which theories have solutions available. Before considering these matters, two related problems about memory and justification should be mentioned.

ii. The Problem of Forgotten Defeat

Unfortunately, we forget more than just our original reasons for believing. We also forget our defeaters, that is, our reasons for not believing, or for doubting our reasons for believing. Sometimes we remember our original reasons, yet forget our defeaters. You remember your original reason for believing that there are boxes in the basement: this morning you saw what looked to you like boxes, in what looked to you like the basement. But suppose your spouse tells you that the children have since taken all of the boxes out of the basement, in order to build a fort outside. Or, your spouse tells you that you did not in fact see boxes in the basement–you saw them in the attic. If you forget what your spouse told you, yet you retain your belief that there are boxes in the basement, you have forgotten a defeater for your belief. On some theories of justification, your belief can still count as justified.

Another kind of forgotten defeat is this. Suppose you never had any reason to believe that there are boxes in the basement, but you believed it anyway. Some theories will count this belief as justified, once you forget that you never had any reason for it. Some philosophers find this result unacceptable (see Annis (1980), Goldman (1999, 2009), Greco (2005) and Huemer (1999)). The Problem of Forgotten Defeat is the problem of accommodating our intuitions about justification in cases where key defeating evidence has been forgotten.

Far more theories face the Problem of Forgotten Defeat than face the Problem of Forgotten Evidence and that is one of the reasons why it is worth distinguishing these problems. Often these problems are conflated–in fact, the former problem has never been given a name before. The reason that many more theories face the Problem of Forgotten Defeat is this. Just about every theory of justification–even theories that deny that some evidence can play a justifying role–grants that some evidence, understood broadly, can play a defeating role. That is, nearly all theories agree that, even if having evidence cannot by itself justify, having evidence can by itself eliminate justification. Your visual experience of the cookie on the table is part of your evidence that there is a cookie on the table. Non-evidentialists will deny that your evidence on its own justifies believing that there is a cookie on the table. But typically they would grant that your evidence at least partially defeats any justification you had for believing that there is nothing on the table. So, cases of forgotten defeat challenge both evidentialist and non-evidentialist theories, although philosophers (for example, Annis (1980), Goldman (1999, 2009), Greco (2005) and Huemer (1999)) have presented the problem as though only internalist and evidentialist theories face it.

iii. The Problem of Stored Beliefs

The final problem centers on beliefs that are merely stored. (Some philosophers instead call these beliefs non-occurrent or standing or dispositional.) These are beliefs that are in no way before the subject’s mind. The believer is not thinking about, reasoning from, acting from, or having an experience concerning them or their content. Contrast these with occurrent beliefs, which are before the subject’s mind. When you are remembering that Plato taught Aristotle, or are telling others about it, your belief that Plato taught Aristotle is occurrent. At most other times–when you are sleeping, driving, playing chess, washing dishes–that belief is merely stored in memory. (This seems true on the storehouse model of memory, at least; on a generative model you may lack the belief at these other times.)

A belief can be both occurrent and stored, just as a song can be both playing and stored on your computer. A merely stored song is stored but not playing. Similarly, a merely stored belief is stored, but not occurrent. It is commonsensical to attribute countless stored beliefs to people who are in normal circumstances. A few moments ago, you had beliefs about chemistry, the first U.S. President, your childhood, panda bears, the Indian Ocean, the Super Bowl and countless other topics. A few moments ago almost all of these beliefs were not just stored, but were merely stored. And it is plausible that many of these beliefs were justified a few moments ago. The Problem of Stored Beliefs is the problem of explaining how the merely stored beliefs that seem justified are indeed justified (for simplicity, the discussion below for the most part omits the ‘merely’).

Thomas Senor (1993) and Alvin Goldman (1999) influentially pose this as a special problem for internalism about epistemic justification (though George Pappas (1980) briefly discusses the more general problem even earlier). Goldman (2011) and Matthew McGrath (2007) target internalist evidentialism in particular. Our occurrent experiences, thoughts and feelings might justify some of our stored beliefs, but not nearly enough. Our active mental lives, at any given time, simply do not bear on most of our stored beliefs. As a result, internalism appears unable to explain how all justified stored beliefs are justified. The same goes for evidentialism, since our evidence is too constrained to fit all our justified stored beliefs.

Andrew Moon (2012) directs a knowledge-version of the Problem of Stored Beliefs toward an evidentialist view concerning knowledge. The evidentialist view is that S knows that p at t only if S believes that p on the basis of evidence at t. We have stored beliefs while we sleep and we know many of these believed propositions. But while we sleep, these beliefs have no evidential basis, so knowledge does not require an evidential basis. Though Moon’s argument concerns just knowledge, we can offer a parallel argument that concerns justified belief. If his original argument is sound, then the parallel argument is too and so justified belief does not require an evidential basis.

Of course, externalist and non-evidentialist theories also face the Problem of Stored Beliefs. But these theories can avail themselves of non-mental, non-evidential resources, so they appear to have an easier time solving the problem. The next section reviews some of these resources.

It is important to distinguish the Problem of Stored Beliefs and the Problem of Forgotten Evidence. The phenomenon of forgetting is essential to the latter problem, but not to the former. We can store in memory our original evidence for a justified, merely stored belief. So there is no relevant forgotten evidence here, but some questions remain: what evidence could justify the belief? Is it the evidence that is stored in memory? How could it justify when it is not accessed? And the phenomenon of having stored beliefs is not essential to the Problem of Forgotten Evidence, but it is obviously essential to the Problem of Stored Beliefs. We can forget the original evidence for a belief that remains occurrent: if I am distracted and exhausted when we meet at a bustling party and you tell me that you are from Santa Fe, I might form the belief that you are from Santa Fe, but immediately forget that you just told me so. I might even be slightly puzzled as to why I find myself believing that you are from Santa Fe. My belief was justified when formed, but what justifies it a moment later, when I have forgotten my evidence?

It is clear, then, that the Problem of Stored Beliefs and the Problem of Forgotten Evidence are dissociable. Consequently it is a mistake to assume that they must share a solution. And it is possibly misleading to introduce the two problems simultaneously with a single example, as some philosophers do, without distinguishing them (see Goldman (2011), for instance). Doing so invites conflation of the problems.

b. Responses

The three problems discussed above are challenging. Tackling them has, however, helped inspire novel epistemological theses and observations about memory, some of which are general and may solve multiple problems, while others are more piecemeal and particular. This section looks first at the more characteristically evidentialist or internalist responses to the Problem of Forgotten Evidence and the Problem of Stored Beliefs and then at the more ecumenical responses. Replies to the Problem of Forgotten Defeat follow.

In answer to the Problem of Forgotten Evidence, Earl Conee and Richard Feldman (2001) point out that in ordinary cases, even when all of S’s original evidence for p is lost, S still has a host of evidence that could justify her in believing that p. This evidence could for example be rooted in induction, background information about memory or conscious recollection. You have forgotten why you originally believed that Fred’s name is “Fred”. But you have reason to believe that you tend to form beliefs with good reason, so you have evidence that you originally had good reason for your belief and this supports the belief. And you have reason to believe that your memory is fairly accurate. Since memory is supplying your belief about Fred’s name, you have justifying evidence for it. And if you are consciously recollecting that Fred’s name is “Fred”, then your experience is displaying that proposition as true, just as perceptual experiences display propositions about the external world as true. So, evidentialists of any stripe (internalist or externalist) can claim that there generally is justifying evidence in the central cases that motivate the Problem of Forgotten Evidence.

However, we do not usually have all of this evidence for a belief that is merely stored. A merely stored belief, by stipulation, is not being consciously recollected. Hence, the Problem of Stored Beliefs remains. Feldman (1988) and Conee and Feldman (2001) propose that justified stored beliefs can have “stored justifications”; S can recall some justifying evidence for p, when S has a justified stored belief that p. S’s evidence for p is stored (compare McCain (2014)). On this view, justified stored beliefs typically are not justified in the most fundamental sense, in the sense that justified occurrent beliefs typically are. When justified in the most fundamental sense, not all of the justifiers are stored, but rather some justifiers are occurrent: experiences, inferences and so on. If it is plausible that justified stored beliefs have the most fundamental kind of justification, then Conee and Feldman’s proposal will not solve the Problem of Stored Beliefs.

On a closely related proposal, the evidence and justifiers are occurrent. Call the proposal dispositionalism: dispositions of the right sort can justify (see Audi (1995), Conee and Feldman (2011), and Ginet (1975)). These dispositions can be memorial. Maria is disposed to recollect that she went to high school in Santa Fe. With the right cue, in ordinary circumstances, she will recollect that fact about her past. On dispositionalism, this disposition justifies her in believing that she went to high school in Santa Fe. The disposition is only occasionally manifest–she only occasionally thinks about where she went to high school–but she nonetheless has the disposition right now; it is not simply stored. As a result, the disposition can epistemically justify in the most fundamental sense right now. In some ways, dispositionalism parallels virtue ethics, which claims among other things that a virtue is a disposition that morally justifies certain actions, even when the disposition is not manifest.

Dispositionalism offers a promising solution to the Problem of Stored Beliefs. It also could solve the Problem of Forgotten Evidence: typically, in cases where S has forgotten her original evidence for her justified belief that p, S still has a disposition to recall p as known or as true. If this disposition justifies believing that p for her, then the Problem of Forgotten Evidence may disappear.

However, dispositionalism still needs crucial development. More must be said about exactly which dispositions justify believing exactly which propositions and how; and it would be good to have a principled way of determining which dispositions a given subject has, in order to see whether dispositionalism attributes to the subject justification for believing just the right propositions.

Conee and Feldman (2001) offer starting material for a final internalist, evidentialist-friendly solution to the Problem of Stored Beliefs. If we have stored beliefs, then these beliefs can justify other beliefs, including other stored beliefs. We can direct this proposal at the Problem of Forgotten Evidence too: stored beliefs can justify a belief for which all original evidence has been forgotten.

A worry for this proposal is that we may not have enough stored beliefs to solve the two problems. We may not, in other words, have enough stored beliefs that could justify all justified stored beliefs and all beliefs that lack their original evidence. Goldman (2009) voices another worry: what ultimately justifies any stored belief? If a belief that p is justified by a stored belief that q, the latter belief should be justified too. It is hard to see how an unjustified belief can by itself justify another. But what justifies the belief that q? Does a stored belief that r justify it? If so, what justifies this stored belief that r? And so on.

A moderate form of coherentism could address Goldman’s worry: if S’s belief that p coheres with certain of S’s other beliefs, then S’s belief that p is justified. Coherence among stored beliefs can justify them. And coherence can justify belief in the face of forgotten evidence. Beliefs can have a special, mutually supporting relationship. Coherentism has its costs (see Coherentism), but perhaps it could, if suitably defended, substantiate Conee and Feldman’s proposal. Still, if any stored belief is justifiedly based on something other than beliefs, then coherentism, even if correct, does not fully solve the Problem of Stored Beliefs (compare Moon 2012: 316-7).

The remaining responses to the problems are also available to externalists and non-evidentialists. A view nearly universally endorsed by discussants of the problems is what we might call preservationism (see Annis (1980), Bernecker (2008), Burge (1997), Goldman (2009, 2011), Naylor (2012), Owens (2000), Pappas (1980) and Senor (2010); some philosophers use ‘preservationism’ to refer to the view called ‘anti-generativism’ below). Roughly put, memory preserves the justification of the beliefs it preserves. More precisely, if S is justified in believing that p at t₁, and retains in memory a belief that p until t₂, then at t₂ S’s belief that p is prima facie justified. (The ‘prima facie’ here allows that the belief may not be justified overall if there are defeaters for it.) Your belief that Plato taught Aristotle was justified when you formed it: a professor or some other clearly credible source told you that Plato taught Aristotle. And you have kept that belief ever since. So, your belief has ever since been justified.

Preservationism seems to provide a simple solution to the Problem of Stored Beliefs. Regardless of whether a belief is stored rather than occurrent, it can retain its justification as long as memory preserves it. A stored belief can inherit justification from the past and this appears to solve the problem. And forgetting evidence does not block the inheritance. So, preservationism appears to solve the Problem of Forgotten Evidence. In fact, a main motivation for preservationism is that it seems to solve these problems at no cost.

But is preservationism true? Externalists think that it is true only if certain features that are external to the mind obtain. Process reliabilists, for example, think that preservationism is true only if memory is reliable. Process reliabilism is roughly the view that justification of a belief depends entirely on the reliability of the process that forms or retains the belief. According to preservationism, beliefs retain justification by being retained in memory. As a result, reliabilists think memory must be reliable in order for preservationism to be true. Since it is contingent whether memory is reliable, on reliabilism it is contingent whether preservationism is true. Reliabilists who appeal to preservationism in order to solve the Problem of Stored Beliefs and the Problem of Forgotten Evidence bear the burden of showing that memory is reliable.

And, if the storehouse view of memory is indeed incorrect, then preservationism appears vacuous unless modified. If memory typically alters the information entering it, it is hard to see how memory could preserve many beliefs over time–the beliefs would seem to be destroyed once their exact content is no longer represented in memory. Preservationists who reject the storehouse view must explain one of two things: first, how memory can nonetheless tend to preserve beliefs, even though it tends to modify the content that enters it; or second, how something other than memory preserves beliefs. Pursuing either option may require developing a novel theory of belief.

Now for replies to the Problem of Forgotten Defeat: Richard Feldman (2005) and Matthew McGrath (2007) in a sense deny that this problem exists. When a defeater is forgotten, it is no longer relevant to what one is justified in believing. Once you forget that your spouse told you that the children removed all of the boxes from the basement, your spouse’s testimony ceases to defeat; you are overall justified in believing that there are boxes in the basement, as long as you still have some support for believing that.

Feldman and McGrath press their point: some attitude toward the proposition that there are boxes in the basement must be justified for you. But which? Abandoning belief in the proposition seems unjustified, since you no longer have reason to abandon your belief. And suspending judgment on the proposition seems unjustified, since you still have some justifying support for believing it–for example, a vivid recollection of what looked like boxes in what looked like the basement. The only potentially justified attitude remaining for you is belief. It is hard to see a competing option. If that is correct, then forgetting defeaters poses no problem. Nothing that is forgotten can defeat. (Of course, something other than the original defeater can still defeat. If for example you recall that you have forgotten a defeater for p, but cannot recall what it was, then arguably you still have a defeater for p: you have reason to believe that you had reason to doubt p. Having reason to believe this is itself reason to doubt p.)

Why, then, suppose that there even is a Problem of Forgotten Defeat? Why suppose that forgotten defeaters remain at all relevant to justification? The main reason is this: many philosophers think that memory, unlike perception, testimony, rational intuition and reasoning, is not a generative source of justification. Memory cannot create or strengthen justification. Rather, memory at most preserves justification that has been acquired from some source (such as perception, testimony and so on). Call this thesis about memory anti-generativism. It is a “garbage in, garbage out” view of justification and memory. An unjustified belief that enters memory remains unjustified, unless new reasons for the belief are acquired from some faculty other than memory. Anti-generativism is traditional and popular (see Annis (1980), Goldman (2009, 2011), Owens (2000) and Senor (2007)), and so are variants of the view that concern knowledge or warrant (see Audi (1997), Burge (1997), Dummett (1994) and Plantinga (1993)). With respect to knowledge, many philosophers think memory and testimony are alike in this way: coming to know that p via testimony requires that the testifier knows that p; testimony does not generate knowledge from non-knowledge.

Sometimes anti-generativism is called “preservationism”, but this is infelicitous. Anti-generativism primarily states a limit on memory: memory does not generate justification, or knowledge, or anything similar. The theory does not centrally concern memory’s power to preserve anything (unlike the theory that is called “preservationism” above, which does centrally concern memory’s preservative power).

If anti-generativism is plausible, then the theories of justification that are compatible with it may avoid the Problem of Forgotten Defeat and theories that are incompatible with it may on that account face the Problem of Forgotten Defeat.

However, generativism, the view that memory can generate justification, is increasingly common (see Audi (2002), Bernecker (2010), Lackey (2005, 2007), Huemer (1999), Michaelian (2011a) and Owens (1996)). Arguments for this view reveal that it comes in several forms. Jennifer Lackey (2005, 2007) and Sven Bernecker (2010) join ranks with Feldman and McGrath in thinking that memory generates justification in cases of forgotten defeat. But notice that the justification generated in these cases is overall, not prima facie. That is, since memory is responsible for the loss of a defeater, memory results in a balance of justification that favors belief. This is not yet to say that memory is creating new reasons for belief. One generativist view, then, is that memory can generate overall justification, even if it cannot generate prima facie justification.

Lackey offers other support for generativism: a subject’s memory can store information that the subject never paid attention to in the past. If the subject recalls and attends to the information afterward, the subject can use it to form justified belief. The basis of this belief would be memory. Lackey builds her support with an example. Suppose that Clifford has his mind on many things while he is driving. Later, his friend Phoebe asks him whether construction on the freeway has begun. Clifford then recalls seeing construction on his recent drive and only then forms a belief that construction on the freeway has begun. His belief is justified and memory is its source. Generativism follows. Still, as Bernecker (2010) observes, if Lackey is correct, she has only supported the generativist view that memory can generate doxastic justification. She has not shown that memory can generate propositional justification. In other words, at best Lackey demonstrates that memory can generate a reasonable belief, not that memory can generate new reasons for believing. Memory merely based a belief on a reason that perception generated.

Huemer (1999) and Michaelian (2011a) endorse the stronger thesis that memory can generate new reasons for believing. Huemer thinks that S’s seeming to remember that p can produce reason for S to form a belief that p, regardless of whether S already had reason to believe that p. And Michaelian attacks the storehouse view of memory, arguing that memory can generate new content and new belief in that content. The belief can have justification when formed as long as certain external conditions are in place (and Michaelian thinks they are). Consequently, sometimes, when memory generates justified belief, it generates justification for believing.

Since anti-generativism is controversial, the severity of the Problem of Forgotten Defeat is unclear. Interestingly, although it is primarily externalists who find the problem to be severe, the mix of internalist and externalist advocates of generativism is fairly even.

4. Memory and Skepticism

So far the surveyed discussion has assumed that memory plays some role in our actually having justification and knowledge, and the discussants have simply debated the margins of that role. But many early and mid-20th century epistemologists worried about this assumption. Why believe memory has an important, or even any, epistemic role? Since this question may invite skepticism, call it a skeptical question for simplicity. Satisfactorily answering this sort of skeptical question about memory is a fundamental epistemological problem. In fact, according to Richard Fumerton (1985), answering it is the most fundamental epistemological problem. If memory has no epistemic role, then we have no reason to believe just about anything we ever learned, or think we learned, at any time in the past.

What is more, memory appears to be involved not just in our retaining what we have learned, but in our very learning. When Chloe tells you “I am changing the oil in my car today,” you use memory even to understand what she is saying–some memory system is responsible for your applying the concepts that make “changing” and “oil” and “car” (and so on) intelligible to you. And you use memory not just to grasp the meaning of words, but also of sentences. Memory is holding fixed in your mind the beginning of Chloe’s statement when she finally says the word “today,” allowing your mind to string concepts together in a way that yields in you a mental representation of what she has testified. Without memory there is no understanding of what is testified. If memory has no epistemic role, then it is hard to see how we could even learn from testimony in the present. Memory seems similarly involved in intuition, reasoning, introspection and perception. Accordingly, it is hard to see how we could learn from those sources if memory plays no epistemic role.

Philosophers have sharpened the general skeptical question about memory into more challenging related sub-questions. This section discusses responses to two of these sub-questions. Answering them is not easy, since they introduce foundational problems that do not arise with other kinds of skepticism. Yet, oddly, philosophers exploring contemporary skepticism have mostly neglected the issue of memory skepticism. Halfway through the 20th century, C. I. Lewis (1946) thought the issue was so significant that the level of silence on it even then was “a bit of a scandal.” And the times have not changed.

a. Memory and Accuracy

Consider

(MR) Memory is reliable.

MR states that memory tends to get things right and that it is generally accurate. It does not state that memory is perfectly accurate. A first skeptical question is: why believe MR? If there is no reason to believe MR, then memory may not provide (either by preserving or by generating) any support for what it represents as true in a given case. If you have no sense as to whether, say, a particular political blog tends to get things right, then there may be no sense in believing anything on the mere basis of the blog. Don Locke (1971) thinks that if we have no reason to believe MR, we have no knowledge via memory at all. If he is correct, it may be critical that we identify support for MR.

It is not at all clear that Locke is correct. But even if he is, our troubles are not as severe as they might seem. Suppose we have little to say positively in answer to the first skeptical question. We may still have reason to believe that memory is often correct (correct, say, around 40% of the time), and even that in the kinds of cases we care about it is usually correct. Further, having no reason to believe MR is not the same as having reason to believe MR is false. Having no reason to believe MR may just require us to be neutral about it. Granted, process reliabilists who must be neutral about MR may be in trouble. They may have to suspend judgment about whether any given belief that memory preserves is justified, since they must suspend judgment about whether such a belief is preserved by a reliable process. But on other theories of justification, perhaps we remain reasonable in thinking that memory justifies.

Still, it would be somewhat troubling if there were no reasons to believe MR. It would be strange for us to rely so heavily in our reasoning and behavior on something we have no reason to believe is typically accurate. It is worth considering MR’s status for us.

Locke considers the following line of support: doubting MR is self-defeating. To raise doubts about MR requires the use of memory. Raising relevant doubts requires citing examples in which memory has erred. But memory alone can supply these examples. If these examples impugn MR, it is because memory supports believing something: the fact that it has erred in certain cases. So, the mere attempt to undermine memory itself vindicates memory in a way.

This result is not clearly worth celebrating. We have merely established that memory supports believing that it itself fails here and there. And even if this result is established, it yields no support for MR. Memory may sometimes support belief, but it could nonetheless typically fail to support belief and could be unreliable. But the self-defeat consideration reveals something unique about memory skepticism: using or contemplating arguments for or against it requires the use of the very faculty being scrutinized. We cannot help but use memory in order to explore memory skepticism. Nothing parallel is true about, say, external-world skepticism. It is not the case that thinking about or offering an argument for or against it must occur via our perceiving something external. Thus, memory skepticism is especially thorny: addressing it uniquely, unfailingly involves some kind of circularity. Thomas Senor (2010) claims that there is no non-circular “demonstration” of MR. If this is true, any demonstration of MR may be suspect.

Richard Brandt (1955) offers an alternative line of support: MR is the best, and only, explanation of our data. What are our data? For Brandt they are our present experience and our having a host of cohering beliefs about the past and about science. According to Brandt (1955: 93), we have these beliefs because our brains have over time interacted with the world in a truth-conducive way and “the only acceptable theory is one which asserts that a large proportion of our memory beliefs are veridical. No alternative to such a theory has been proposed; nor can one imagine what one would be like.” If MR is the only explanation of our data, MR is by default the best explanation and it may thereby be credible. Contra Senor and others, Brandt thinks this support for MR is non-circular, since it does not take for granted that any recollections are accurate.

But there is reason to think Brandt’s support for MR is indeed circular. To support MR, Brandt makes an explanatory inference based on our data. But why suppose we have the very data he thinks we have–why suppose we have a host of cohering beliefs? That we have them is not wholly manifest to us at one time. We must use memory, in order to appreciate it. We think about our various beliefs, how they fit together, how snug that fit is and we make an inference about how our beliefs cohere. This thinking and inferring is not instantaneous. It unfolds over time and (we presume) memory holds fixed and supports the parts that (we also presume) have already unfolded. So there is a kind of circularity: memory is used in establishing the data MR allegedly explains (see BonJour (2010: 169-171) and Plantinga (1993: 61-4)). If this circularity is vicious, Brandt’s argument yields no new reason for believing MR.

Another objection to Brandt’s argument is that MR is not the only explanation of our data. Bertrand Russell (1921/1995) provides a famous rival hypothesis: we and the world came to exist only five minutes ago and it merely appears that everything is much older. In each of us is a package of cohering beliefs about the past. And we find rings in trees, rust on cars and ruins in Rome. All of this is misleading. Everything is new. As unpalatable as this hypothesis is, it is not easy to disprove. At any rate, it is a rival explanation of our data. Oddly, Brandt actually considers a Russellian hypothesis, but dismisses it as a fantasy, wholly lacking “evidential foundation”. But there is no need for evidence for Russell’s hypothesis, beyond this: it fits the data. Since it does, MR has an explanatory rival. We cannot assume MR is the best explanation. We must do the hard work of showing it is better than Russell’s hypothesis.

b. Memory and the Past

Our target has shifted from defending MR to defending something more basic. Memory could be massively misleading. For any view about the past, why suppose it is even approximately right? That is our second skeptical question. The first skeptical question challenges our view about how memory performs overall. It still allows that memory provides reason to accept some appearances about the past. The second question goes further, probing each appearance. It challenges our view about memory’s performance in each given case. Answering this question well is especially demanding.

Senor (2010) claims that most philosophers agree that Russell’s hypothesis has not been refuted. Regardless of whether Senor and these philosophers are correct, note that the demand here is greater than just refuting the particular hypothesis that Russell offered. Russell’s exact hypothesis may be bad: it seems ad hoc and uninformative. The present demand is to show why all hypotheses like Russell’s are inferior. One hypothesis similar to his is that the world and its inhabitants all popped into existence six minutes ago. Another is that the world is as old as it seems, but just its inhabitants popped into existence five minutes ago. In order to reasonably hold our commonsensical beliefs about the past, we must have reason to reject each skeptical hypothesis that is incompatible with the truth of our commonsensical beliefs. Moreover, we must have reason to think that what we commonsensically believe explains our data better than the entire disjunction of skeptical hypotheses does.

Russell proposes a pragmatic answer to the second skeptical question. Taking memory appearances at face value is extremely practical. We cannot help but do it and it works. Skepticism, therefore, poses no genuine threat. This answer appeals to something like the practical rationality of believing that the past really is how it seems. But this answer tells us nothing about the epistemic rationality of believing anything about the past. Even if Russell is right, we do not have on that account a key ingredient for knowledge of the past: epistemic justification.

One family of replies to the second skeptical question uses transcendental arguments to reject Russell’s hypothesis. A transcendental argument is of this form: A obtains; A is impossible in the absence of B; (therefore) B obtains. Norman Malcolm (1963) and Sydney Shoemaker (1967) offer the following transcendental argument: we know how to make past-tense statements; this competence requires that most of these statements are true; (therefore) most of these statements are true. Since these statements express our beliefs about the past, most of these beliefs are true. Not only does MR follow, but it also follows that the past tends to fit our expressed views about it.

The general idea behind this argument is that one’s having skill at using a kind of statement is incompatible with one’s systematically misusing it. If Elmer sincerely refers to toasters, clouds and orange things as “rabbits,” then Elmer must not be using that word to talk about rabbits. There must be an alternative way of understanding his “rabbits” expressions, such that they tend to be true. Now, we are competent at making statements about the past. It follows that most of these statements are true and so are our corresponding beliefs.

Don Locke (1971: 135-7) offers a transcendental argument for MR (compare Lewis (1946)), which may also answer the second skeptical question. The fact that we have knowledge at all and that we inquire, requires that we have memory knowledge. And we in fact know things and we in fact inquire. (In support of this claim we might note that it seems readily proven: are we inquiring? Yes!). So there is memory knowledge. And, as noted earlier, Locke thinks that if there is memory knowledge, then MR is true. So, he concludes that MR is true. If Locke is right, it follows that Russell’s hypothesis is false. The world did not come into existence five minutes ago. Many hypotheses like Russell’s will also be false. This may answer the second skeptical question. Our reason to suppose that a given belief about the past is true is that it is of a class of beliefs that tend to be correct.

Some philosophers doubt that transcendental arguments can rationally support any anti-skeptical conclusions. But even if some can, the transcendental arguments covered here are questionable. Malcolm and Shoemaker take it as a datum that we know how to make past-tense statements. But why accept the datum? In answer, we can at best cite the kinds of statements we can recall ourselves competently making. And why suppose that the past resembles those recollections? If we popped into existence five minutes ago, those recollections are misleading. If the transcendental argument simply assumes that the recollections are accurate, then the argument fails to generate support for believing that they are accurate.

Similarly, in reply to Locke: why suppose we inquire? You might blush with embarrassment and note that to ask that question is to inquire. But why suppose a question has been asked? Observing inquiry may rely on memory. Perhaps we cannot even think at all or observe a case of inquiry all in one moment. Perhaps thought and observation are always extended in time and we may need to use memory, in order to observe the temporal extension of anything.

As noted, a transcendental argument is of the form: A obtains; A is impossible in the absence of B; (therefore) B obtains. The replies to the transcendental arguments here question the first premise. To support anything as data, we may need to use memory. If that is correct, it may seem viciously circular then to use this data, in order to support either memory or beliefs about the past.

However, one might think that this reveals that memory skepticism is indeed self-defeating. Merely raising a skeptical challenge to MR or to views about the past uses some data about memory or the past. This data may include the fact that observing inquiry requires memory, or that Russell’s hypothesis is compatible with one’s having a given recollection. But if we need to use memory in order to support any data, then raising a skeptical challenge about memory uses memory. So, anyone who offers such a challenge undermines her own position. If memory truly supported nothing, skepticism could have no support.

This line of reasoning notes a conflict between an activity (supporting memory skepticism) and a theory (memory skepticism). Unfortunately, even if there is a conflict, the theory may still be correct (compare Bernecker (2008: 130-1) and Fumerton (1995: 52)). Why believe memory skepticism is false? Even if supporting memory skepticism is self-defeating, it may still be true. And, we may still be justified in believing memory skepticism, but simply unable to demonstrate its support.

Finally, Sven Bernecker (2008: 131-3) attempts to “disarm” Russell’s hypothesis and skepticism about the past by taking a relevant alternatives approach (see Contextualism in Epistemology). Bernecker thinks memory can provide us with knowledge about the past, even if we do not know that there is a past and even if we do not know that Russell’s hypothesis is false. Here is why: a table can be flat, even if it appears bumpy under a microscope. The table is not relevantly bumpy, so it counts as flat. Bernecker thinks knowledge is similar to flatness. S’s knowing that p does not require that S is able to know every alternative to p to be false. S might know that p and yet be unable to rule out some situation in which not-p is true. All S must be able to rule out are the relevant alternatives to p–the relevant situations in which not-p is true.

For example, in order for you to know that Plato taught Aristotle, you must be able to rule out the relevant alternatives to that fact. One relevant alternative is that Socrates alone taught Aristotle. And you can rule this out: you have reason to believe that Socrates swigged his poisoned hemlock years before Aristotle’s birth. Although Russell’s hypothesis is an alternative to what you believe about the past, and one you may be unable to rule out, Bernecker thinks it is ordinarily an irrelevant alternative. So memory can provide knowledge of the past even when you cannot rule out Russell’s hypothesis.

Bernecker’s reply faces difficult objections. It is not obvious that knowledge is sufficiently like flatness. Supposing it is, it is unclear that Russell’s hypothesis is ordinarily irrelevant. And, supposing it is, why agree that we can rule out the alternatives that are relevant? What enables you to rule out that Socrates alone taught Aristotle–evidence from memory? The strength of this evidence should be in question if Russell’s hypothesis is not yet ruled out. But, supposing we can rule out the relevant alternatives, Bernecker’s reply may leave us unsatisfied. At best it secures for us bits of knowledge about the past, yet it does not secure knowledge that the past exists or knowledge that Russell’s hypothesis is false. The latter two results seem simply to concede victory to an unpalatable skepticism. And they pair oddly with the former result–how could we simultaneously have knowledge about the past from memory and yet lack knowledge from memory that the past exists? Whatever ultimately explains the one, suggests the other is false.

It is clear that satisfactorily answering the skeptical questions is not easy. There have been other attempts to answer them, but none more promising or developed than those mentioned here (for additional discussion, see Locke (1971) and Bernecker (2008)). Since memory skepticism threatens most of our knowledge and justification, failing to rule it out would be uncomfortable. Still, for two reasons it would be premature to despair. First, even if we cannot show that memory skepticism is false, it is unclear what is thereby threatened or what we are thereby required to believe, if anything. This is because even if memory skepticism is true, it is unclear what we can conclude (compare BonJour (2010: 170-1)). If memory must support any justifying inference or data about the past and memory cannot provide such support, then what are we justified in inferring from the truth of memory skepticism? It is hard to say. Second, we should not confuse our failing to disprove memory skepticism with our having no reason to believe anything about the past or having reason to deny MR. It could very well be that memory is reliable and justifying, but that we simply have a hard time showing it.

5. References and Further Reading

Alston, William P. “An Internalist Externalism.” Synthese 74.3 (1988): 265–283.
- Offers an externalist theory of justification that respects key epistemic roles of mental phenomena.
Annis, David B. “Memory and Justification.” Philosophy and Phenomenological Research 40.3 (1980): 324–333.
- An article weighing in on several main issues concerning memory and justification, including preservationism, anti-generativism and the Problem of Forgotten Defeat.
Anscombe, G. E. M. Collected Philosophical Papers, Vol. 2: Metaphysics and the Philosophy of Mind. University of Minnesota Press, 1981.
- The chapter “Memory, ‘Experience’, and Causation” discusses the relationship between remembering and knowledge.
Audi, Robert. “Dispositional Beliefs and Dispositions to Believe.” Nous 28.4 (1994): 416–434.
- Distinguishes beliefs that are stored (dispositional beliefs) from inclinations to form beliefs. Likens memory to a computer.
Audi, Robert. “Memorial Justification.” Philosophical Topics 23.1 (1995): 31–45.
- Discusses from an internalist perspective a number of topics concerning memory.
Audi, Robert. “The Place of Testimony in the Fabric of Knowledge and Justification.” American Philosophical Quarterly 34.4 (1997): 405–422.
- Discusses a version of preservationism about knowledge and memory’s similarity to testimony in epistemology.
Audi, Robert. “The Sources of Knowledge.” The Oxford Handbook of Epistemology. Ed. Paul K. Moser. Oxford University Press, 2002. 71–94.
- Defends generativism and an epistemic theory of memory.
Augustine. Confessions. Ed. H. Chadwick. Oxford University Press, 1991.
- In Book X, describes memory in terms of a storehouse.
Ayer, A. J. The Problem of Knowledge. Vol. 8. Harmondsworth, 1956.
- Chapter 4 endorses the epistemic theory of memory and other connections between memory and knowledge.
Bartlett, Frederic. Remembering: a Study in Experimental and Social Psychology. Cambridge University Press, 1932.
- Commonly thought to be the first work in psychology to present memory as generative.
Bergson, Henri. Matter and Memory. Trans. N.M. Paul and W.S. Palmer. Zone Books, 1896/1994.
- Early distinction of memory systems by a philosopher.
Bernecker, Sven. Memory: A Philosophical Study. Oxford University Press, 2010.
- One of the only recent philosophical monographs on memory, this book develops themes from Bernecker’s earlier work, defends generativism and attacks the epistemic theory of memory.
Bernecker, Sven. The Metaphysics of Memory. Springer, 2008.
- Thorough philosophical discussion of many metaphysical and some epistemological issues bearing on memory, including skepticism about memory and problems for internalism.
BonJour, Laurence. Epistemology: Classic Problems and Contemporary Responses. Rowman & Littlefield Publishers, Inc., 2010.
- Written for a general philosophical audience, chapter 8 introduces many problems in the epistemology of memory.
Brandt, Richard B. “The Epistemological Status of Memory Beliefs.” Philosophical Review 64.1 (1955): 78–95.
- Provides an inference to the best explanation reply to memory skepticism.
Burge, Tyler. “Interlocution, Perception, and Memory.” Philosophical Studies 86.1 (1997): 21–47.
- Endorses preservationism and anti-generativism, while alleging parallels between memory and testimony.
Comesaña, Juan. “Evidentialist Reliabilism.” Noûs 44.4 (2010): 571–600.
- States an evidentialist version of process reliabilism.
Conee, Earl, and Richard Feldman. “Evidence.” Epistemology: New Essays. Ed. Quentin Smith. Oxford University Press, 2008.
- The best-known defenders of evidentialism develop and clarify several aspects of their theory.
Conee, Earl, and Richard Feldman. “Internalism Defended.” American Philosophical Quarterly 38.1 (2001): 1–18.
- Defends internalism from the Problem of Forgotten Evidence, the Problem of Stored Beliefs and other objections.
Conee, Earl, and Richard Feldman. “Replies.” Evidentialism and Its Discontents. Ed. Trent Dougherty. Oxford University Press, 2011.
- Proposes a dispositionalist solution to the Problem of Stored Beliefs.
Debus, Dorothea. “Accounting for Epistemic Relevance: A New Problem for the Causal Theory of Memory.” American Philosophical Quarterly 47.1 (2010): 17–29.
- Considers generative aspects of memory, while criticizing Martin and Deutscher’s rival to the epistemic theory of memory.
Dummett, Michael. “Testimony and Memory.” Knowing From Words. Ed. A. Chakrabarti and B. K. Matilal. Kluwer, 1994. 251–272.
- Likens memory to testimony and endorses anti-generativism.
Feldman, Richard. “Having Evidence.” Philosophical Analysis. Ed. D. F. Austin. Kluwer Academic Publishers, 1988. 83–104.
- Proposes that justified stored beliefs usually only have stored justifications.
Feldman, Richard. “Justification Is Internal.” Contemporary Debates in Epistemology. Ed. Matthias Steup and Ernest Sosa. Blackwell, 2005. 270–84.
- Defends internalism from the Problem of Forgotten Defeat.
Feldman, Richard, and Earl Conee. “Evidentialism.” Philosophical Studies 48.1 (1985): 15–34.
- The most influential paper to state and advocate evidentialism.
Fumerton, Richard A. Metaepistemology and Skepticism. Rowman & Littlefield, 1995.
- Brings out the difficulty of satisfactorily rejecting memory skepticism.
Fumerton, Richard A. Metaphysical and Epistemological Problems of Perception. Lincoln: University of Nebraska Press, 1985.
- Highlights the importance of the epistemology of memory to epistemology in general.
Ginet, Carl. Knowledge, Perception, and Memory. Vol. 26. D. Reidel Pub. Co., 1975.
- Perhaps the first contemporary statement of dispositionalism.
Goldman, Alvin I. “Internalism, Externalism, and the Architecture of Justification.” Journal of Philosophy 106.6 (2009): 309–338.
- Argues for externalism and against internalism in light of the epistemology of memory.
Goldman, Alvin I. “Internalism Exposed.” Journal of Philosophy 96.6 (1999): 271–293.
- An influential criticism of internalism that has drawn attention to the Problem of Forgotten Evidence and the Problem of Stored Beliefs.
Goldman, Alvin I. “Toward a Synthesis of Reliabilism and Evidentialism? Or: Evidentialism’s Troubles, Reliabilism’s Rescue Package.” Evidentialism and Its Discontents. Ed. Trent Dougherty. Oxford University Press, 2011.
- Continues to press several objections to internalism rooted in the epistemology of memory and sketches a version of reliabilism that incorporates evidentialist insights.
Greco, John. “Justification Is Not Internal.” Contemporary Debates in Epistemology. Ed. Matthias Steup and Ernest Sosa. Blackwell, 2005. 257–269.
- Attacks internalism in light of the Problem of Forgotten Defeat, among other problems.
Harman, Gilbert. Change in View. MIT Press, 1986.
- Chapter 4 responds to the Problem of Forgotten Evidence and has popularized it.
Huemer, Michael. “The Problem of Memory Knowledge.” Pacific Philosophical Quarterly 80.4 (1999): 346–357.
- Endorses the Problem of Forgotten Defeat, yet also a form of generativism.
Lackey, Jennifer. “Memory as a Generative Epistemic Source.” Philosophy and Phenomenological Research 70.3 (2005): 636–658.
- Argues for generativism and against anti-generativism.
Lackey, Jennifer. “Why Memory Really Is a Generative Epistemic Source: A Reply to Senor.” Philosophy and Phenomenological Research 74.1 (2007): 209–219.
- Defends her earlier arguments for generativism and against anti-generativism from Thomas Senor’s objections.
Lewis, Clarence I. An Analysis of Knowledge and Valuation. Open Court, 1946.
- Early and influential discussion of memory skepticism.
Locke, Don. Memory. Vol. 13. Macmillan, 1971.
- One of the few book-length philosophical discussions of memory. Nearly all replies to memory skepticism on offer are scrutinized and a transcendental argument against memory skepticism is advanced.
Malcolm, Norman. Knowledge and Certainty. Englewood Cliffs, N.J., Prentice-Hall, 1963.
- Defends the epistemic theory of memory and a transcendental argument against memory skepticism.
Martin, Charles B., and Max Deutscher. “Remembering.” Philosophical Review 75.2 (1966): 161–96.
- One of the first criticisms of the epistemic theory of memory. Presents an influential rival theory.
McCain, Kevin. Evidentialism and Epistemic Justification. Routledge, 2014.
- Develops and defends what may be the most complete and detailed statement of an evidentialist, internalist theory of justification. Advocates a “stored justifications” type reply to some problems in the epistemology of memory.
McGrath, Matthew. “Memory and Epistemic Conservatism.” Synthese 157.1 (2007): 1–24.
- Uses the epistemology of memory, in order to criticize evidentialism and to defend a rival internalist theory of justification.
Michaelian, Kourken. “Generative Memory.” Philosophical Psychology 24.3 (2011a): 323–342.
- Assembles wide-ranging cognitive psychological research in an effort to challenge the storehouse model of memory and to advance a generative model. Sketches how reliabilism might accommodate a generative model.
Michaelian, Kourken. “Is Memory a Natural Kind?” Memory Studies 4.2 (2011b): 170–189.
- Empirically informed philosophical discussion of the various memory systems. Denies that memory is a natural kind.
Moon, Andrew. “Knowing Without Evidence.” Mind 121.482 (2012): 309–331.
- Presents to evidentialism a knowledge version of the Problem of Stored Beliefs centered on the basis of stored beliefs.
Moon, Andrew. “Remembering Entails Knowing.” Synthese 190.14 (2013): 2717–2729.
- Argues that remembering entails knowing and criticizes Bernecker’s attempts to show otherwise.
Naylor, Andrew. “Belief from the Past.” European Journal of Philosophy 20.4 (2012): 598–620.
- Adopts preservationism, while arguing for a theory about what it is to believe one did something from having done it.
Owens, David J. “A Lockean Theory of Memory Experience.” Philosophy and Phenomenological Research 56.2 (1996): 319–32.
- One of the first arguments for a kind of generativism.
Owens, David J. Reason without Freedom: The Problem of Epistemic Normativity. Routledge, 2000.
- Discusses preservationism, a kind of anti-generativism and the epistemic theory of memory.
Pappas, George S. “Lost Justification.” Midwest Studies in Philosophy 5.1 (1980): 127–134.
- An early and underappreciated statement of many problems in the epistemology of memory, including the Problem of Forgotten Evidence and the Problem of Stored Beliefs.
Plantinga, Alvin. Warrant and Proper Function. Oxford University Press, 1993.
- Endorses anti-generativism about warrant and criticizes inference to the best explanation replies to memory skepticism.
Plato. Theaetetus. Trans. H.N. Fowler, Loeb Classical Library. London: William Heinemann, 1921.
- Among Western philosophy’s earliest work in the epistemology of memory, endorsing a storehouse model and epistemic theory of memory.
Russell, Bertrand. The Analysis of Mind. London: Routledge, 1921/1995.
- One of the first discussions of memory skepticism, famously hypothesizing that we came to exist only five minutes ago.
Schacter, Daniel L. Searching for Memory: The Brain, the Mind, and the Past. New York: Basic Books, 1996.
- Summarizes a considerable amount of psychological research on memory for a popular audience, with many citations for further reading. Explains how a generative model of memory, rather than a storehouse model, better fits the research.
Schacter, Daniel L. The Seven Sins of Memory: How the Mind Forgets and Remembers. Boston: Mariner Books, 2002.
- Presents for a general audience a wealth of findings on the psychology of memory, exploring whether the general limits of human memory constitute defects. Provides additional references for further reading and supports the generative model of memory.
Senor, Thomas D. “Internalistic Foundationalism and the Justification of Memory Belief.” Synthese 94.3 (1993): 453–476.
- Presents the Problem of Stored Beliefs as a special problem for internalism.
Senor, Thomas D. “Memory.” A Companion to Epistemology. Ed. Jonathan Dancy, Ernest Sosa, and Matthias Steup. Wiley-Blackwell, 2010.
- Concisely surveys many issues in the epistemology of memory.
Senor, Thomas D. “Preserving Preservationism: A Reply to Lackey.” Philosophy and Phenomenological Research 74.1 (2007): 199–208.
- Defends anti-generativism from Lackey’s criticisms.
Shanton, Karen. “Memory, Knowledge and Epistemic Competence.” Review of Philosophy and Psychology 2.1 (2011): 89–104.
- Argues that a condition, which Ernest Sosa and others think is necessary for knowledge, rules out knowledge from episodic memory.
Shoemaker, Sydney. “Memory.” The Encyclopedia of Philosophy, Volume 5. Ed. P. Edwards. Macmillan, 1967. 265–274.
- A summary of the philosophy of memory up to the mid-20th century. Offers a transcendental argument against memory skepticism.
Williamson, Timothy. Knowledge and Its Limits. Oxford University Press, 2000.
- Endorses the epistemic theory of memory and the view that all and only evidence is knowledge.

Author Information

Matthew Frise
Email: matthew_frise@baylor.edu
Baylor University
U. S. A.

Locke: Knowledge of the External World

The problem of how we can know the existence and nature of the world external to our mind is one of the oldest and most difficult in philosophy. The discussions by John Locke (1632-1704) of knowledge of the external world have proved to be some of the most confusing and difficult passages of his entire body of philosophical work. Difficulties develop on several fronts.

First, in his main work in epistemology, An Essay Concerning Human Understanding, Locke seems to adopt a representative theory of perception. According to Locke, the only things we perceive (at least immediately) are ideas. Many of Locke’s readers have wondered, how can we know the world beyond our ideas if we only ever perceive such ideas?

Second, Locke’s epistemology is built around a strict distinction between knowledge and mere probable opinion or belief. Locke appears to define knowledge, however, so as to rule out the possibility of knowledge of the external world. His definition of knowledge as the perception of agreement between ideas has seemed to many of his readers to restrict knowledge to our own thoughts and ideas. Locke himself, however, emphasizes that knowledge of the external world is neither based on inference or reasoning nor is it based on reflecting on ideas somehow already in the mind. Instead, it is achieved through sensory experience. Thus, knowledge of the external world, even as Locke himself describes it, is clearly not a matter of merely knowing facts about our own minds.

Third, many of the special difficulties of understanding how knowledge of the external world is possible stem from what seem to be devastating skeptical arguments against the possibility of such knowledge. Locke’s approach to skepticism, however, has seemed unfocused and possibly in tension with itself. Locke alternately suggests that skepticism cannot be refuted even if we have at least some good reasons to believe it is mistaken, that genuine skepticism is not psychologically possible for human beings, and that skepticism is incoherent.

Ultimately, examining Locke’s discussions around knowledge of the external world can prove one of the most rewarding points of entry into Locke’s theoretical philosophy. Understanding what Locke thinks knowledge of the external world is and how it fits within his broader epistemology and theoretical philosophy requires probing beyond his epistemology and into the depths of his accounts of perception, representation, and the contents of thought. Properly appreciating his position vis-a-vis skepticism likewise leads to issues concerning Locke’s views on the fundamental nature of reality and our limited ability to grasp it. We can know that there is an external world but not much, if anything, about the nature of the world itself.

What is Locke’s Category of Sensitive Knowledge?
Sensitive Knowledge and Locke’s Broader Epistemology
Sensitive Knowledge and Skepticism about the External World
Conclusion
References and Further Reading
1. Primary Texts
2. Secondary Literature

1. What is Locke’s Category of Sensitive Knowledge?

Suppose that you’re waiting with a friend in a hallway to go into a meeting. While furiously making last minute adjustments to the presentation the two of you are about to give, she asks, “My throat is dry, are there any water fountains around?” You look up and down the hallway, see one at the north end of the hallway and reply, “There’s a water fountain over there.” Your friend gets up, walks to the fountain, and takes a drink.

It seems to many people and many philosophers—including John Locke—that when you said, “there’s a water fountain over there,” you expressed some knowledge to your friend. She acted on that knowledge and quenched her thirst. Your helpful statement expressed a paradigmatic instance of knowledge of the external world. According to Locke there are two main questions to ask about any kind of knowledge, including cases like the knowledge of the external world you shared with your friend. First, what do you know? Second, how do you acquire or achieve such knowledge? This section will explore Locke’s answers to the what and the how of knowledge of the external world.

a. The Content of Sensitive Knowledge

For now we will simply suppose that you did have some knowledge of the external world to share with your friend. Section three below will examine Locke’s replies to various skeptical worries to the effect that we have no such knowledge. Assuming that you did have some knowledge to share, what exactly did you know and share with your friend? Or, as we might put it in more technical terms, what is the content of your knowledge in this case? More generally, what do we know in cases of knowledge of the external world?

According to Locke, knowledge of the external world is knowledge of ‘real existence.’ Knowledge of real existence is knowledge that something really exists and is not a mere figment of your imagination. Locke argues that we can know three different kinds of things really exist. First, each person can know their own existence at any given time. I can know now that I exist at this time. You can know, as you read this, that you exist while you read this. Locke’s claim here is reminiscent of Descartes’ claim that we know our own existence in every act of thinking—even when we doubt our own existence. Second, Locke believes that we can know that God exists. Locke offers a proof of God’s existence in Book IV, chapter 10 of the Essay. Third, we can know that other things distinct from our minds really exist. When you said to your friend that there was a water fountain over there, the knowledge of real existence you expressed was of this third kind. As you looked at the fountain you knew that there was then something distinct from your mind really existing—the water fountain. That’s not to say this was the only other thing you knew to exist at that time. Presumably you also knew many other things distinct from your mind to exist at that time: the floor you were standing on, the hallway you waited in, the doors in the hallway, etc. The knowledge you shared with your friend, however, concerned the existence of the water fountain. You knew that the water fountain existed distinct from your mind. In general, knowledge of the external world is knowledge of the existence of a thing distinct from one’s mind.

b. How We Come to have Sensitive Knowledge

Locke gives a somewhat unusual name to knowledge of the external world. It is often called ‘sensory knowledge,’ but Locke calls such knowledge ‘sensitive knowledge.’ He uses this phrase to mark the distinct way that we achieve knowledge of the external world. There is something special, according to Locke, about how knowledge of the external world is achieved that sets it apart from how knowledge of other matters, such as mathematical knowledge, is achieved. Knowledge of the external world is known ‘sensitively’—rather than ‘intuitively’ or ‘demonstratively.’ Locke calls these three ways of coming to knowledge the three degrees of knowledge. Before examining what Locke means when he says that knowledge of the external world is achieved sensitively, it is helpful to consider the other ways Locke believes we come to knowledge—the other ‘degrees’ of knowledge.

According to Locke, knowledge of the external world is different than what he calls intuitive knowledge. Intuitive knowledge is knowledge that we grasp immediately and without any need for proof or explanation. For example, anyone who has ideas of the colors white and black and compares those ideas immediately knows that white is not black. This is the kind of knowledge we often have concerning the meanings of words, at least when words are given explicit definition. To use one of Locke’s examples, if ‘gold’ is defined as a yellow metal, then we can know that gold is yellow. In calling knowledge of the external world ‘sensitive knowledge’ Locke is again marking that such knowledge is distinct from intuitive knowledge.

Locke also holds that knowledge of the external world is different than the kind of knowledge we achieve through proofs or argument. When someone proves that the sum of the three interior angles of a triangle is equal to the sum of two right angles through a proof with multiple steps, Locke calls such knowledge demonstrative knowledge. Locke would say that such a person has demonstrated their conclusion. Demonstrative knowledge, for Locke, is knowledge arrived at by what is called a ‘deductive argument’ today. Locke calls knowledge of the external world ‘sensitive knowledge’ to mark that he does not take it to be a kind of demonstrative knowledge. Knowledge of the external world is not arrived at by any such argument or proof.

Knowledge of the external world is not achieved through thinking about the definitions of our terms or comparing ideas that we have already acquired. Knowledge of the external world doesn’t rest on any proof of the external world. Instead, knowledge of the external world is achieved in sensory experience. It is through the entrance of an idea into our mind through the senses that we have knowledge of the external world. Locke writes, “’Tis therefore the actual receiving of ideas from without that gives us notice of the existence of other things and makes us know that something doth exist at that time without us which causes that idea in us…” (E Book IV, chapter 11, section 2). Suppose that the water fountain you saw was newly installed and had a fresh coat of crimson paint. As you looked at the water fountain and light reflected from the fountain to your eyes an idea of that distinct crimson color entered your mind. According to Locke, as the sensation of that color entered your mind you knew that something crimson existed distinct from your mind by its somehow producing that sensation in you.

Your knowledge of the existence of something crimson is therefore acquired in a way distinct from either intuitive or demonstrative knowledge. It does not depend on a proof or on comparing ideas already existing in your mind. Such knowledge is achieved upon looking at the water fountain and the water fountain’s effect on your mind through your senses.

c. The Limitations of Sensitive Knowledge

So far, then, we have seen both the what and the how of knowledge of the external world according to Locke. What we know is real existence. How we know it is through sensation—through the reception of ideas into our minds. The what and the how combine to place some severe limits on what Locke thinks we can know about the external world.

First, our knowledge of the external world only extends as far as current sensory experience. As you look at the water fountain you know that it now exists. When you look away from the water fountain as you turn back to your friend, you no longer know that it now exists. You only now know that it existed when you were looking at it. Similarly, you do not know that it existed before you looked at it. Locke does think that it is highly probable for you that the water fountain existed before and after you look at it. Indeed, he thinks that it is nearly, if not completely, impossible for you to avoid believing that the fountain existed before you saw it and continues to exist after you turn away. Your belief that the water fountain exists when you are not looking at it, then, is both rational and psychologically compelling, according to Locke. Our knowledge extends over relatively little of the world we ordinarily believe to exist. We only know to exist the sensible objects of our immediate sensory environment that are currently affecting us.

Second, we only know the world as it appears to us through our senses. We do not know its underlying nature as it is in itself. This point can be helpfully illustrated by considering a new case. Suppose, for example, that you go on a field trip to gold country. You and the rest of the class dip a sieve into the river and sift out a few flakes of a yellowish metal. The class then goes into a mine, chips off chunks of rock, crushes them up, and sifts out more pieces of yellowish metal from the crushed stone. At the end of the field trip the class spreads all of the collected pieces of yellowish metal in front of them. As you survey the spread of hunks of yellowish metal you can know that there now exist several distinct objects that affect your mind by producing certain ideas in it: sensations of yellow, solidity, etc. What you do not know is that there is some underlying nature that now exists in each of these hunks of stuff. Moreover, you do not know that they all have the same underlying nature. We are ignorant, in other words, about both the underlying nature of each individual object and whether the objects that appear similarly to us have similar underlying natures. There may be tremendous evidence supporting the theory that describes the underlying microstructure of these hunks of stuff and even explains why a microstructure of that type produces the appearances you now see. Such microstructure or underlying nature, however, is not part of how the hunks of stuff now appear to you. Thus, while it may be overwhelmingly probable that some underlying common nature exists in all of the things spread before you, you do not know that that nature exists before you.

One way to make vivid the drastic nature of this limitation on knowledge of the external world is to consider different possible uses of a word like ‘gold.’ If we use the word ‘gold’ to refer to an underlying nature, such as underlying chemical or atomic structure, then on Locke’s view we do not know that gold exists. The belief that gold exists would be a very rational one to hold, based on all of the evidence we have to support our best physical and chemical theories. Nevertheless, such a belief would not be knowledge. If, on the other hand, we use the word ‘gold’ to pick out a category of things which appear to us in a certain distinct way, we may know that gold exists when we experience it. So, for example, if I use ‘gold’ to mean a heavy, yellowish, metallic-feeling thing, then I may know that gold exists when I experience a heavy, yellowish, metallic-feeling thing. Insofar as people use ‘gold’ in the former sense to pick out a chemical or physical kind, rather than in the latter sense to describe a category of thing with a particular sensory appearance, then we do not know that gold exists. In the terminology Locke develops in the Essay, one way to understand this point is that while we can never know that any particular ‘real essence’ exists, we can know that a kind of thing with a certain nominal essence exists.

Third, knowledge of the external world does not extend to other minds. Recall that Locke takes knowledge of the external world to be sensitive knowledge. Sensitive knowledge is achieved as a result of things operating on us through our senses. Locke does not think that other minds affect us directly through our senses. (Our own mind produces ideas in us through what Locke calls reflection, a kind of inner sense directed at our own mind.) At best, the minds of other creatures, including other human beings, affect the behavior of such creatures’ bodies. Those bodies then affect our minds through our senses. As a result, no other minds directly produce ideas in our minds through our senses. Locke does think it overwhelmingly probable, given the similarity of the behavior of other human beings to one’s own behavior, that other human beings, at least, have minds (see 4.11.12). Moreover, believing that other human beings (or even other ‘lower’ animals) have minds may be psychologically irresistible for us (that is, solipsism may not be a real psychological option for us). So, as in the case of believing that objects continue to exist when we don’t experience them, Locke sees belief in other minds as both rationally and psychologically compelling but he does not see it as knowledge.

Overall, then, we can sum up Locke’s account of the how and the what of knowledge of the external world as follows:

What: in particular instances of knowledge of the external world we know the existence of a thing external to our mind. When you saw the water fountain, for example, you knew that a crimson thing, that is a thing with a power to produce a certain sensation in you, then existed.

How: in particular instances of knowledge of the external world we know the existence of a thing with various powers to affect our mind by producing ideas in our mind by virtue of our awareness of the entrance of those ideas into our mind. When you saw the water fountain, for example, you knew that a thing produced a certain visual idea in your mind at that time; that a crimson sensation was then entering your mind.

2. Sensitive Knowledge and Locke’s Broader Epistemology

In section 1 we explored what sensitive knowledge is: what do we know? how do we know it? what are some of the (perhaps surprising) limitations Locke places on sensitive knowledge? This section will explore what has seemed to many to be one of the most puzzling aspects of Locke’s discussion of sensitive knowledge: its compatibility with Locke’s own definition of knowledge. This is a question of how to integrate Locke’s discussion of sensitive knowledge with his broader epistemology. There is a large range of opinion among Locke scholars on whether Locke’s definition of knowledge is compatible with sensitive knowledge, but until very recently most of it was overwhelmingly pessimistic. Most of Locke’s readers have thought that sensitive knowledge can’t fit under Locke’s official definition of knowledge and is incompatible with his broader epistemology. More recently, however, Locke scholars have attempted to show how sensitive knowledge can be explained in the terms of Locke’s official definition of knowledge.

After introducing Locke’s definition of knowledge and laying out its prima facie incompatibility with sensitive knowledge, this section will briefly explain various attempts that have been made to integrate sensitive knowledge with Locke’s epistemology.

a. Locke’s Definition of Knowledge

The final Book of the Essay is dedicated to knowledge and opinion. Locke begins Book IV with a definition of knowledge. To appreciate the potential tension between the definition of knowledge and sensitive knowledge it is worth quoting the definition at length. Locke writes:

Knowledge then seems to me to be nothing but the perception of the connection and agreement, or disagreement and repugnancy of any of our ideas. In this alone it consists. Where this perception is, there is knowledge, and where it is not, there, though we may fancy, guess, or believe, yet we always come short of knowledge. E IV.i.2 (emphases original)

Locke and his readers frequently shorten this definition of knowledge by calling knowledge the perception of agreement of ideas. This entry will adopt that convention.

There are important questions about Locke’s definition of knowledge that bear on its compatibility with sensitive knowledge. Foremost is how to resolve an ambiguity in the definition. There are two ways to read the second ‘of’ in ‘the perception of agreement of ideas.’ First, one could read it as saying that knowledge is the perception of agreement of ideas with something or other, not necessarily another idea. Second, one may read the definition as stating that knowledge is the perception of agreement between ideas—the perception of agreement of one idea with another idea. As we will see below in section 2.7, one route to resolving the tension between sensitive knowledge and Locke’s definition of knowledge is to adopt the first interpretation of the definition. Most of Locke’s readers, however, have rejected this option. In the margin next to the paragraph following the definition of knowledge, Locke noted in his personal copy of the Essay that knowledge is the perception of agreement between two ideas. Following this lead, nearly all of Locke’s readers have taken the second reading, that Locke defines knowledge as the perception of agreement between ideas.

Having fixed an interpretation of Locke’s definition of knowledge, we can now turn to bringing out the tension between Locke’s definition of knowledge and sensitive knowledge. To begin, one might wonder: what does an agreement between two ideas tell us about what exists beyond those ideas? Knowledge of the external world, according to Locke, is knowledge of the existence of something distinct from our mind (and so, of course, distinct from the ideas in our mind). Even Locke himself notes that the mere existence of an idea of something does not guarantee the existence of what that idea is an idea of. Merely having an idea of a freshly painted crimson water fountain does not guarantee that a freshly painted crimson water fountain really exists. At this point, if there is to be any hope, we ought to take a step back and ask: what are the two ideas that agree in sensitive knowledge? It seems clear that if I know the crimson water fountain exists, my idea of it will be one of the ideas. What is the second idea?

We might start making progress on this question by considering the content of sensitive knowledge. As detailed in section one above, we know that a thing exists distinct from our mind. For example, when you saw the freshly painted crimson water fountain down the hall, you knew that a crimson thing really exists. Perhaps, then, sensitive knowledge involves the perception of agreement between the idea of a thing and the idea of real existence. When you look down the hall and know the water fountain exists you perceive an agreement between your idea of the crimson water fountain and the idea of real existence.

One difficulty facing this view is that it’s not clear how to make sense of the idea of real existence agreeing with the idea of anything (except, perhaps, the idea of God). The problem here can be made vivid by adopting a particular understanding of what it is for ideas to agree. On a popular way of interpreting Locke’s account of knowledge, perceiving an agreement between ideas is perceiving some sort of connection between the ideas. In proving, for example, that the sum of the interior angles of a triangle is equal to the sum of two right angles, one perceives through a series of steps that the ideas are connected by the relation of equality. But what would the connection between the idea of real existence and the idea of a thing, such as your idea of the freshly painted crimson water fountain, be? It certainly can’t be that the idea of the freshly painted crimson water fountain entails the idea of real existence since it isn’t necessary that the water fountain exists. Again, contrast sensitive knowledge with intuitive knowledge of the meaning of a term. If ‘gold’ is defined as a yellow metal, then the idea of yellow is entailed by the idea of gold; it is contained within it. Anything that is a yellow metal is yellow. But in the case of my idea of the crimson water fountain it’s not true that anything that is a crimson water fountain really exists. The crimson water fountain between the houses of Santa Claus and the Easter Bunny, for example, doesn’t really exist. What, then, is the connection between the ideas perceived to agree in sensitive knowledge and how is such a connection perceived through sensory experience?

We might try to sum up the problem facing Locke’s account as follows. Locke’s definition of knowledge appears to make all knowledge a priori. That is, it seems to make all knowledge depend on reflecting and comparing our ideas to one another in an attempt to understand relations between our ideas. But knowledge of the external world is patently not a priori. What (contingently, at least) exists in the world can’t be known to exist merely by reflecting on our own thoughts. In the remainder of this section, we’ll explore various approaches to the question of whether and how Locke’s definition of knowledge can accommodate sensitive knowledge. As we will see, the question of how to integrate sensitive knowledge with Locke’s account of knowledge brings us to consider many central aspects of Locke’s theoretical philosophy beyond his epistemology.

b. Sensitive Knowledge as Incompatible with Locke’s Definition of Knowledge

One tradition that stretches back to Locke’s first readers is simply that Locke bungled his epistemology. One of Locke’s most public critics was Edward Stillingfleet, Bishop of Worcester. Locke and Stillingfleet corresponded in a series of public letters. One of the very first criticisms Stillingfleet leveled at Locke was that his definition of knowledge in terms of ideas makes knowledge of the real world, including even knowledge of its existence, impossible. This criticism persisted even into the twentieth century. Locke, such readers maintain, makes all knowledge a priori. Knowledge of the external world is not a priori. Therefore, Locke’s definition makes knowledge of the external world impossible. Locke’s repeated insistence that we do have sensitive knowledge despite its incompatibility with his definition, such readers maintain, is a result either of his failure to recognize the problem or of a dogmatic insistence that we have such knowledge. The former option is not particularly plausible in light of Locke’s correspondence with Stillingfleet. In fact, Locke responds to Stillingfleet’s charge by describing the ideas perceived to agree in sensitive knowledge. We will shortly have a chance to consider Locke’s response in section 2.4 below. For now it is enough to recognize that Locke surely did not simply miss the apparent problem. That leaves us with the second option. Locke, on this view, brought out a tension with excruciating clarity but was not able to resolve it and instead merely wallowed in it clinging to both sources of the tension.

Though historical figures are as prone to error and clinging to positions they cannot adequately defend as any of us, it is generally best to explain such error or dogmatic clinging rather than simply leave it as unexplained brute failure. Those who think that Locke simply crashed headlong into the tension between knowledge of the external world and his definition of knowledge without offering much in the way of resolution often explain Locke’s position as a result of his particular period in history. Locke, on these views, found himself caught between the expanding and improving new science, and its mechanistic world view, on the one hand, and an old epistemological paradigm with its emphasis on certainty, on the other. The tension between Locke’s claims about sensitive knowledge and his definition of knowledge reflects this broader tension at large during Locke’s life between the changing shape and power of empirical inquiry and attitudes about knowledge.

c. Sensitive Knowledge and Locke’s Theory of Representation

A second approach to making sense of Locke’s claim that we have sensitive knowledge despite its apparent tension with his definition of knowledge attempts to find resolution in Locke’s philosophy of mind. The basic aim of this approach is to show how sensitive knowledge fits with the broader spirit of Locke’s philosophy even if it runs against the letter of his epistemology. Locke, on these views, supplements his official definition of knowledge with a tacit reliabilism about knowledge when it comes to knowledge of the external world. The groundwork for Locke’s reliabilism is to be found in his account of the meaning of a special kind of idea. To appreciate this approach it will help to take a step back and consider in some detail Locke’s account of how the mind comes to acquire its ideas.

Locke’s aim in Book II of the Essay is to demonstrate how all of our ideas can be acquired through experience. To this end, Locke divides ideas into simple and complex ideas. Simple ideas are passively received by the mind and have no other ideas as parts. So, for example, when I bite into a pineapple I might receive several different simple ideas. One such idea would be the taste of the pineapple. Another might be the feeling of solidity or resistance as I bite into it. Yet another might be the particular wet, slippery texture of the fruit in my mouth, etc. After I’m done chewing it, I might notice a particular sticky texture left on my fingers where I held the fruit. The taste, the various textures, the different shades of yellow, all are different simple ideas for Locke. More specifically, these are all simple ideas of sensation; simple ideas produced in the mind by things outside of the mind operating on it through the senses. Locke also holds that we have simple ideas of reflection. Simple ideas of reflection are ideas of the mind’s own operations. They are ideas produced in the mind when those operations are active. Reflection, Locke thinks, is like our outer senses but directed at the mind’s own activity rather than at an external world. All of these simple ideas of reflection and sensation are passively received by the mind.

Complex ideas are ideas produced by the mind operating on ideas that are somehow already in the mind, whether simple or complex. One way to form complex ideas is by putting two ideas together. One might, for example, combine the visual appearance of a banana with the taste of a pineapple in imagining a ‘pineana.’ Or one might compare a fruit fly crawling on a pineapple to the pineapple itself to form the idea of the larger than relation. Or, one might combine ideas of certain bodily movements corresponding to certain forms of music to make the idea of a dance. All of these would be complex ideas. The operations Locke most frequently discusses are operations of combining ideas, comparing ideas, and abstracting ideas.

The central thrust of Locke’s account of the origins of our ideas is that given a certain set of simple ideas and a certain set of mental operations we can explain how we get all of the ideas we have. Sensation, reflection, and operations of the mind can explain all of the ideas human beings have according to Locke. That is, all of the contents of our thoughts can be traced to origins in sensation or reflection and some combination of mental operations.

Locke’s category of simple ideas is relevant to sensitive knowledge because it occupies a special place within his broader theory of ideas. Simple ideas of sensation are unique among all ideas in that they both represent the external world as well as represent their object perfectly. Some of Locke’s readers have concluded that this unique place in Locke’s theory of ideas makes simple ideas of sensation ripe for use in understanding Locke’s claims about sensitive knowledge.

We can consider each of these features of simple ideas—that they represent external reality and that they represent it perfectly well—in comparison to other ideas. Simple ideas are produced in our minds by other things operating on us. As a result, Locke claims, they represent the power to produce those ideas—that is, the object a simple idea perfectly represents is the power to produce that idea. Simple ideas are not the only ideas that represent mind-independent reality, however. Our ideas of things, whether particular individuals or kinds of things, also represent mind-independent reality. Locke calls this type of idea ideas of substances and they are complex ideas. For example, my idea of a particular individual horse, Mr. Ed, is an idea of a substance. It is an idea of a particular thing which has various qualities. Ideas of substances are therefore ideas that represent (or at least purport to represent) extra-mental reality.

We have other ideas besides simple ideas and ideas of substances, however. We also have ideas of relations and modes. For reasons that go beyond the scope of this entry, Locke does not take our ideas of either kind to represent mind-independent reality. Simple ideas and ideas of substances alone among all ideas represent the external world.

Though simple and substance ideas are alike in representing the external world, they differ with respect to how well they represent the world. Only simple ideas, according to Locke, represent the external world perfectly. Whether an idea of a particular individual substance (Mr. Ed) or an idea of a kind of substance (horses), our ideas of substances all fail to some degree in representing what they aim to represent. To see this difference, we can first consider why simple ideas represent their object perfectly. According to Locke, simple ideas represent the power to produce those ideas in us. That is all they represent. Ideas of substances, by contrast, purport to represent an individual or kind of individual. To do so requires representing that individual or kind as having all and only the qualities it in fact has. If my idea of Mr. Ed does not include an idea of the color of his eyes, then my idea of Mr. Ed falls short of representing Mr. Ed as he really is. It is not, in Locke’s terms, an adequate idea. Similarly, if my idea of Mr. Ed represents him as having a dark spot above his tail, but Mr. Ed does not have such a spot, my idea is again an inadequate idea. Now, to have an adequate idea of a particular substance or kind of substance would be to represent not only all of its sensible qualities—that is, ideas of all of the ways in which it can affect our senses—but also to have ideas of all of its abilities to affect other things. It seems clear—at least in Locke’s mind—that no amount of humanly possible experimentation could ever reveal that degree of detail. At the very least, we simply cannot bring any given thing to interact with all of the other things in the universe to understand the effects it may have on them or them on it.

Simple ideas of sensation, then, stand alone as ideas that both represent the external world and perfectly represent it. For that reason it has seemed to some that simple ideas of sensation are fit to explain sensitive knowledge. Namely, Locke can combine this externalism about content with an externalism about knowledge. Simple ideas have external content in the sense that they represent their cause. Such ideas are fit for knowledge of the external world because inference from effects to causes is of sufficient reliability to count as knowledge.

Even if one grants this interpretation of the external content of simple ideas, there are different ways of filling in the details. How the details of this content are filled in, moreover, has implications for the content of sensitive knowledge.

One possibility is that simple ideas are what M.R. Ayers calls ‘blank effects.’ The presence of a particular simple idea in your mind on a specific occasion indicates nothing but a power to produce that idea in you at that time. The causes of that idea across different occasions may have nothing in common, and may bear no similarity to one another outside of their ability to produce that idea in your mind.

A second possibility is that simple ideas represent something like their normal or designated cause. The cause may be ‘usual’ or ‘normal’ in any number of senses. It may be the cause that God has ordained for an idea. It may be the cause that most often produces the idea. It may be the cause that was naturally designed to produce the idea. Which of these readings a proponent of this interpretation adopts is not especially important for the purposes of this entry. What is important is that what is meant by the power to produce an idea in this sense is a particular kind of structure in the world.

To illustrate the difference between these interpretations consider the following comparison. Take a particular sweet taste (say, the taste of the glaze on a donut) and a particular non-sweet taste (say, the taste of Tabasco hot sauce). Now consider the effects of the so-called miracle berry. If one eats a miracle berry, Tabasco sauce will taste like donut glaze and donut glaze will taste like Tabasco sauce. Consider this sequence.

T0: Taste some Tabasco sauce producing the simple idea of the taste of Tabasco in one’s mind.

T1: Eat a miracle berry.

T2: Taste some Tabasco sauce producing the simple idea of the taste of donut glaze in one’s mind.

On blank effect readings, powers to produce simple ideas are fully perceiver-relative entities. As a result, the Tabasco sauce has two different powers at T0 and T2. At T0 it has the power to produce the idea of the taste of Tabasco. At T2, because of the effects of the miracle berry, it now has the power to produce the idea of the taste of donut glaze and no longer has the power to produce the idea of the taste of Tabasco.

By contrast, one may think that Locke’s simple ideas have some stronger, more external content. On this stronger reading the ‘power to produce an idea’ is something like the chemical structure that is the usual cause of a certain idea. On this reading, the Tabasco sauce has the same powers at T0 and T2 because it has the same chemical structure and would have the same effect on a normal perceiver.

Recent Locke scholars such as M.R. Ayers and Martha Bolton have paired externalism about the content of simple ideas with externalism about the knowledge such ideas allow for. On the blank effects reading, if you judge that the cause of a simple idea exists on the basis of your having that simple idea, you cannot be wrong. Such judgments are perfectly reliable and therefore ought to be regarded as knowledge. If you are blindfolded, unknowingly ingest a miracle berry, sample some Tabasco sauce, and then judge that you have tasted some donut glaze, you are, in a sense, correct and have sensitive knowledge. You have tasted something with the power to produce the simple idea of the taste of donut glaze in you. That is all the knowledge of the external world we have, on this view: there exist certain powers to affect our minds by producing ideas in us. On this reading we only know the world in relation to ourselves.

On the stronger, more external readings, if you judge that the cause of a simple idea exists on the basis of having that simple idea, you are normally right. According to these views, however that ‘normally’ is cashed out, it will be good enough for such beliefs to amount to knowledge even if they aren’t perfectly reliable because we can be in unusual perceptual circumstances. If you are blindfolded, unknowingly ingest a miracle berry, sample some Tabasco sauce, and then judge that you tasted some donut glaze, you are wrong and do not have sensitive knowledge. You have not sampled the usual cause of that idea. When you do, however, taste some actual donut glaze and on that basis judge that there is something with the power to produce the idea of the taste of donut glaze in you, you are right and do have knowledge. This reading of Locke makes his view more similar to that of contemporary externalist epistemologies which deny that having knowledge entails that one knows that one has knowledge (the so-called KK principle). The blank-effects reading, by contrast, remains compatible with knowing that one knows.

Understanding sensitive knowledge in light of his semantics for simple ideas does not ultimately reconcile sensitive knowledge with Locke’s definition of knowledge. Rather, doing so highlights how Locke has resources from his philosophy of mind and its account of the content of thought to supplement his official definition of knowledge with a kind of reliabilism about knowledge. Approaches under this umbrella diverge in how reliable they take such judgments about the existence of the cause to be, where the reliability depends on the external content of Locke’s simple ideas.

d. Simple Ideas of Reflection and Cognitive Faculty Indicators

Some Locke scholars have attempted to reconcile Locke’s definition of knowledge with sensitive knowledge. They attempt to make sense of sensitive knowledge as the perception of agreement between ideas by finding a connection between the idea of real existence and the idea of a sensible object, such as the water fountain from section one. Interpretations developed by Newman, Allen, and Nagel attempt to draw this connection through an idea of reflection.

To understand this approach, it will be helpful to consider a part of Locke’s theory of ideas only briefly mentioned in 2.3. Recall that we receive simple ideas through two channels according to Locke’s theory of ideas: sensation and reflection. Simple ideas of sensation are produced by objects external to our mind operating on us through our senses. Ideas of reflection, by contrast, are received into the mind by a kind of inner sense—the mind’s awareness of its own activities. The aforementioned interpreters claim that ideas of reflection function as a kind of cognitive faculty indicator analogous to something like a time stamp on a video or photograph. Recording devices frequently time stamp what they record. That is, the recording produced by the device includes information about the time it was recorded. These interpretations attribute a similar view to Locke when it comes to the mental faculty by which an idea comes to be in the mind. The mind, in being aware of its activities, stamps any given idea with an idea of the faculty by which the former is produced in the mind on that occasion. This cognitive faculty indicator provides the connection between the idea of the sensible object and the idea of real existence.

According to Locke, a sensory experience of the sun is manifestly different from a memory of the sun. In fact, Locke claims that a sensory experience of the sun is as distinct from a memory of the sun as it is from a sensory experience or memory of the moon. According to those like Allen, Nagel, and Newman, Locke explains this difference as a matter of each way of thinking about the sun involving distinct ideas of reflection. Looking at the sun in the middle of a cloudless day, the idea of the sun is ‘stamped’ with the idea of actual sensation. The idea of actual sensation is an idea of reflection; an idea of the mental faculty responsible for producing the idea of the sun in the mind at that time. Later that night when remembering how the sun looked at midday, an idea of the sun is again in the mind but this time it is stamped with the idea of memory. The idea of memory is likewise an idea of reflection; an idea of the mental faculty active in producing the idea of the sun in my mind at this later time.

According to this line of interpretation, there are three ideas involved in any given instance of sensitive knowledge. First, there is the idea of the sensible object—the idea of the sun or your idea of the water fountain. Second, there is the idea of sensation. This is an idea of reflection. Third, there is the idea of real existence. The idea of sensation functions as an intermediary connecting the idea of the sensible object to the idea of real existence. The connection between the idea of sensation and the idea of real existence is supposed to be the kind of a priori connection involved in intuitive and demonstrative knowledge. If you are having a sensation then the cause of that sensation exists outside of your mind. Sensation just is being affected by the external world. Given that an idea is stamped with the reflective idea of sensation, we can safely infer that the cause of the idea so stamped exists outside of our mind. The connection between the idea of sensation and the idea of the sensible object is not like this—and it is not clear exactly what this relation is according to Locke (possibly co-occurrence in the mind or some special mode of binding). The important point to note is just that the agreement between the idea of sensation and the idea of real existence is a different kind of agreement than that between the idea of sensation and the idea of the sensible object.

Interpreters disagree on what to make of this difference in the relation between the three ideas involved in sensitive knowledge. Newman suggests that the relation between the idea of actual sensation and the idea of the sensible object (the idea of the sun) only yields a probable opinion and not strict knowledge. Newman emphasizes that the involvement of probable opinion as a component of sensitive knowledge explains Locke’s claims that sensitive knowledge is the least certain of all forms of knowledge. Nagel and Allen, by contrast, hold that both the relation between the idea of actual sensation and the idea of the sensible object as well as the connection between the idea of actual sensation and the idea of real existence are knowledge conferring connections.

The textual motivation for these views comes from Locke’s exchange with Stillingfleet. In section 2.2 above, we saw that Stillingfleet pressed Locke on whether his account of knowledge could handle knowledge of the existence of the external world. Locke responded by describing the ideas perceived to agree in sensitive knowledge. It is worth considering the complete passage:

Now the two ideas that in this case are perceived to agree and do thereby produce knowledge are the idea of actual sensation (which is an action whereof I have a clear and distinct idea) and the idea of actual existence of something without me that causes that sensation. The Works of John Locke, vol. 4, p. 360.

According to these views, when Locke says that one of the ideas perceived to agree in sensitive knowledge is ‘the idea of actual sensation,’ he is naming an idea of reflection, an idea of an operation of the mind. The phrase as it appears in the passage, however, is ambiguous. Locke may be saying that one of the ideas perceived to agree in sensitive knowledge is a sensation—in Locke’s official terminology of the Essay, a simple idea received through sensation—rather than an idea of a certain operation of the mind. Indeed, Locke seems to refer back to this idea as a sensation rather than as an idea of reflection when naming the second idea perceived to agree in sensitive knowledge. He calls it an idea of something that causes ‘that sensation.’ An idea of reflection such as the idea of sensation or the idea of memory is not a sensation. Proponents and opponents of the simple idea of reflection approach give this passage and other similar passages from the Essay and Locke’s correspondence much attention.

In addition to textual worries, one might have philosophical worries about understanding sensitive knowledge as dependent on the reflective idea of sensation. Namely, it might seem to leave Locke open to obvious skeptical objections. On what grounds should we trust our cognitive faculty indicator? Just as one might doubt that a sensory idea really is produced by something external to our minds, one might worry that our ideas of reflection do not accurately track which mental faculties were responsible for producing an idea in our mind. This kind of skeptical doubt, however, is separate from the attempt to sketch how Locke’s definition of knowledge can fit with sensitive knowledge. After all, one might doubt demonstrative knowledge or intuitive knowledge as well. We will return to Locke’s replies to skepticism in section three below.

e. Sensitive Knowledge as Assurance rather than Strict Knowledge

Sam Rickless has recently advanced what he calls the assurance view of sensitive knowledge. Like the approaches discussed in 2.2 and 2.3, Rickless does not think that sensitive knowledge can be reconciled with Locke’s account of knowledge. However, Rickless argues that Locke himself did not think that sensitive knowledge was, strictly speaking, knowledge after all. As illustrated in 1.2, part of the point of Locke’s discussion of sensitive knowledge is to mark it off as distinct from other forms of knowledge. The philosophical motivation for the assurance approach lies in taking Locke’s definition of knowledge to give knowledge an a priori nature. It simply runs contrary to such a definition that we might know the existence of a contingent, finite object distinct from our minds.

The textual basis of the assurance approach lies in some of the key phrases Locke uses to describe sensitive knowledge. Locke calls sensitive knowledge a kind of ‘assurance.’ ‘Assurance’ is a term that Locke later uses in Book IV of the Essay as a name for mere probable opinion that falls short of knowledge. Similarly, Locke says that sensitive knowledge ‘passes under’ the name of knowledge rather than straightforwardly calling it knowledge. Finally, as noted above, Locke believes that sensitive knowledge is less certain than intuitive or demonstrative knowledge. It seems difficult to understand how sensitive knowledge could be less certain but nevertheless knowledge. Rickless suggests that we can make sense of the lesser certainty of sensitive knowledge by recognizing that it is not knowledge, strictly speaking, at all.

f. Analyzing Knowledge rather than Defining its Subject Matter

Another approach of note developed during the late twentieth century in the work of Ruth Mattern and then David Soles. Mattern and Soles attempt to reconcile sensitive knowledge with Locke’s definition of knowledge by developing the claim that Locke’s definition of knowledge is merely an analysis of knowledge rather than a description of the subject matter of knowledge. In other words, when Locke defines knowledge as the perception of agreement between ideas he is not claiming that knowledge is about ideas or relations between ideas. Rather, Locke’s definition of knowledge expresses what we do when we achieve knowledge about whatever subject matter we’re interested in: the existence of a thing, the relation between two mathematical objects, etc. Knowledge is grasping the truth of a proposition, seeing that a proposition is true. Locke’s definition of knowledge merely says the same using the Essay’s favored terminology of ideas.

An important consequence of this view is that it pushes back against the claim that all knowledge is of an a priori nature for Locke. His definition in and of itself merely says that knowledge is grasping the truth of a proposition. There may be many ways to ‘grasp’ or perceive the truth of a proposition that do not involve merely thinking about our own ideas. In other words, Locke’s definition leaves open the scope of our knowledge, the ways in which we can perceive any given truth. We might perceive the truth of some propositions using a priori methods, as happens in mathematics. However, there might be other ways of perceiving the truth of a proposition and so coming to knowledge.

Though both Mattern and Soles emphasize this consequence of their view, neither develops in detail how Locke might think we perceive the truth of the kinds of existential propositions known in sensitive knowledge. What sets their approach apart from those mentioned thus far, however, is that rather than try to fit sensitive knowledge to a more widely accepted understanding of Locke’s definition of knowledge, Mattern and Soles take the root of the problem of incorporating sensitive knowledge within Locke’s epistemology to lie in a widespread misunderstanding of Locke’s definition of knowledge.

g. Sensitive Knowledge and Direct Perception

Finally, John Yolton pioneered an approach to the problem of incorporating sensitive knowledge within Locke’s epistemology based on his larger project of developing an interpretation of Locke’s entire Essay as offering a direct realist theory of perception. At the core of Yolton’s view is a radical departure from Locke scholarship regarding the nature of Locke’s ideas. Ideas, according to Yolton, are acts rather than objects. On Yolton’s view, sensitive knowledge just is perceiving the agreement of an idea with a thing itself. It therefore trades on an interpretation of Locke’s definition of knowledge which is out of favor within current Locke scholarship, as noted above. Namely, that Locke’s definition of knowledge treats knowledge as the perception of agreement between an idea and some thing, not necessarily another idea. With his direct perception interpretation in the background, Yolton is positioned to say that sensitive knowledge can be a perception of agreement between an idea and a really existing thing itself. Yolton’s direct perception interpretation—if not his reading of Locke’s definition of knowledge—has been developed and defended in recent work by Tom Lennon, which will be noted in the annotated bibliography below.

3. Sensitive Knowledge and Skepticism about the External World

Section 1 explored what Locke takes knowledge of the external world to be, its content and the means by which it is achieved. Section 2 focused on the relationship between Locke’s discussion of knowledge of the external world and his broader epistemology. Knowledge of the external world, however, is often best known for its perplexing relationship with skepticism. This section will explore Locke’s attitude towards and arguments against skepticism.

Locke himself is well aware of skeptical worries about the external world. Each time he brings up sensitive knowledge in the Essay, he follows his introduction of the topic with a discussion of skeptical worries. This section will explore three threads in Locke’s response to the skeptic. First, we will consider what Locke calls ‘concurrent reasons.’ These are reasons that Locke takes to support sensitive knowledge, though it appears that he does not think any ordinary instances of sensitive knowledge are based on these reasons. Second, Locke believes that sensitive knowledge is not susceptible to practical doubt. Even if one says that one doubts that the external world exists, sensory experience unfailingly guides human actions. That is, no one can act as if they doubted what their senses tell them about the external world. Third, Locke seems to think that the skeptic, at least in her stronger forms, is self-undermining.

a. The Concurrent Reasons with Sensitive Knowledge

Locke notes that in addition to knowing the existence of a thing when we see it, we have four ‘concurrent reasons’ that further support sensitive knowledge. Some of these reasons commonly crop up in discussions of skepticism in the early modern period from Descartes to Hume.

The first reason that Locke offers is that sensations depend on having senses. People without the requisite sensory organs fail to have the relevant ideas. Merely having the organs isn’t sufficient for having the ideas—a person with eyes sees no colors in the dark. So, it would seem that an external object to the senses is necessary for sensations.

To a skeptic, this is not likely to be especially convincing. After all, the skeptic doubts the very basis of claims that we have sensory organs or that sensory organs themselves are not sufficient for sensations—sense-based observations. Locke’s point here presupposes the veracity of observations of sensory organs and instances of failing to have certain ideas in certain external world conditions.

The second reason Locke offers as concurrent with sensitive knowledge is that sensations are manifestly different than other forms of thought, such as memory or imagination. As we saw above in section 2.4, Locke takes a memory of the sun to be as different from a sensory experience of the sun as a memory of the moon. One way that Locke makes this point vivid concerns our passivity in sensory experience. We can neither produce a sensory experience at will nor prevent ourselves from having a sensory experience at will. When you look up the hall with open eyes it is not up to you whether you see a crimson water fountain. Your mind is simply acted upon. By contrast, we do often exercise voluntary control over memories. We recall previous thoughts and experiences and create new things in thought at will.

A skeptic could, of course, question the force of this reason. The skeptic may point out that we could be passive in sensory experience in our dreams and hallucinations, or because we are disembodied brains in vats. Indeed, the skeptic may insist, we may be wholly non-physical minds subject to the whims of a malicious demon. Nevertheless, even if our passivity with respect to sensation doesn’t prove that the external world exists, Locke may offer it as at least a point that can be built on as part of an argument that the best explanation of our sensory experience is an external world.

The third concurrent reason Locke offers concerns the special connection between sensory experience and pleasure and pain. Locke points out that pleasure and pain are uniquely connected to sensory experience. Remembering the warmth of the sun doesn’t bring the same pleasure as basking in it. Remembering the burn of the fire doesn’t bring the same pain as did reaching in to save a child’s cherished toy accidentally flung into the flames. The value of this reply and its more precise argument against the skeptic will be explored below in section 3.2.

The fourth and final concurrent reason Locke offers is a very familiar one. Our senses, Locke points out, tend to confirm and mutually support one another. We can touch what we see to verify that what we see really exists. Again, this sort of consideration is not on its own decisive against a skeptic. After all, a malicious demon could arrange the same sort of consistency. However, this kind of consideration can be regarded as a concurrent reason to our sensitive knowledge insofar as the mutual support of our senses is a point that can be part of a larger case in favor of the existence of an external world. Perhaps the best explanation—if not the only possible explanation—of both our passivity and the coherence of our sensations is that an external world is the cause of them.

One of the most interesting aspects of Locke’s concurrent reasons, however, is that they are offered by Locke as reasons supporting the truth of the content of sensitive knowledge. That raises a question about sensitive knowledge itself. Does Locke think that instances of sensitive knowledge themselves rest on any reasons? Do we infer the existence of some thing distinct from our minds on the basis of some premise concerning the ideas we have at the time? If Locke does think that sensitive knowledge is based on some reasons, he never clearly articulates what those reasons are or how they are acquired. Perhaps, then, sensitive knowledge is non-inferential and not based on any reasons. Shelley Weinberg has developed an account of sensitive knowledge as non-inferential. Indeed, a non-inferential view of sensitive knowledge seems to fit neatly with the contrast observed in section one above which Locke draws between sensitive and demonstrative knowledge. Demonstrative knowledge, recall, is knowledge achieved by reasoning from premises.

A consequence of taking sensitive knowledge to be non-inferential is that the skeptic cannot be proven wrong—we cannot prove the existence of an external world even if we know it to exist in sensitive knowledge. These concurrent reasons at best make it probable that the external world exists. The concurrent reasons Locke offers, then, are not intended to provide a decisive defeat of the skeptic as part of a proof of the external world. Instead, they provide what Locke takes to be the strongest rational support possible.

b. Skepticism and Practical Doubt

In addition to emphasizing the special connection between sensory experience, on the one hand, and pleasure and pain, on the other, Locke repeatedly remarks that skepticism can be cured by fire. Locke writes:

For he that sees a candle burning, and hath experimented the force of its flame by putting his finger in it will little doubt that this is something existing without him which does him harm and puts him to great pain: which is assurance enough when no man requires greater certainty to govern his actions by, than what is as certain as his actions themselves. And if our dreamer pleases to try, whether the glowing heat of a glass furnace be barely a wandering imagination in a drowsy man’s fancy by putting his hand into it he may perhaps be wakened into a certainty greater than he could wish that it is something more than bare imagination. E IV.xi.8.

Locke’s point in this and similar passages seems to be that the deliverances of our senses are connected with pleasure and pain in such a way as to make it impossible to doubt our senses for the purposes of guiding our actions. A skeptic, for example, may deny that the glass furnace exists, but if she sticks her hand into the furnace she will irresistibly act on the deliverances of her senses. She will move her hand away from where she perceives the furnace to be, betraying that she in fact accepts what her senses tell her about the world. For the purposes of guiding her action, then, even the skeptic takes the deliverances of her senses to be real.

It is not immediately clear how strong a rejoinder to the skeptic this is. The skeptic may reply that though they are compelled to act in certain cases this doesn’t mean that they genuinely accept the deliverances of the senses. Or, perhaps more strongly, the skeptic may reply that though they are compelled to assent to what the senses convey, such assent is not rational or reasonable. It is more like a reflex than an action.

Jennifer Nagel has argued that Locke anticipates this kind of response from the skeptic. Locke, according to Nagel, argues that all it is to treat something as really existing is to treat it as action guiding. Locke, in other words, might be taken to collapse the distinction between real existence and real for practical purposes of guiding our action with respect to pleasure and pain. This move by Locke taps into one of Locke’s earliest diagnoses of skepticism: it is rooted in an excessive demand on our rational faculties that derives from insufficient appreciation of the purpose of our faculties.

The purpose of our cognitive faculties, Locke suggests in the Essay’s introduction, is to secure happiness both in this world and beyond. Insofar as our senses provide a guide to securing pleasures and avoiding pains, the senses fulfill their purposes and achieve all the knowledge we can reasonably hope for. A skeptic, then, who hopes for more knowledge over and above guidance with respect to pleasure and pain simply demands too much. Even the skeptic can’t practically deny that our senses do give us knowledge of how to guide our actions with respect to pleasure and pain. That is all there is to knowledge of real existence.

Ultimately, a reply to skepticism based on collapsing real existence with action guidance is only as strong as that collapse. Any form of skepticism that takes knowledge of real existence to be more than knowledge of how to pursue pleasure and avoid pain will remain unmoved. Locke’s view could be more convincing if it were accompanied by a defense of his views about the purpose of our cognitive faculties.

c. Skepticism as Self-Undermining

A final line of response to skepticism can be found in Locke’s discussion of sensitive knowledge. When Locke mentions skeptical worries he tends to dismiss them as unworthy of—or possibly as themselves ruling out—serious response. Here are two examples:

If anyone say a dream may do the same thing and all these ideas may be produced in us without any external objects he may please to dream that I make him this answer, 1. that ’tis no great matter whether I remove his scruple or no: where all is but dream, reasoning and arguments are of no use, truth and knowledge nothing… E IV.ii.14

For I think nobody can, in earnest, be so skeptical, as to be uncertain of the existence of those things which he sees and feels. At least, he that can doubt so far, (whatever he may have with his own Thoughts) will never have any controversy with me; since he can never be sure I say anything contrary to his opinion. E IV.xi.3

Locke seems to suggest in these passages that the skeptic is in some way self-undermining. In raising the possibilities they do, they somehow undermine the ability to even coherently talk about knowledge of the external world. Keith Allen has recently developed an argument that connects the account of sensitive knowledge as a perception of agreement between ideas, discussed in section 2.4 above, with this line of anti-skeptical response.

Section 2.4 considered an approach to reconciling Locke’s definition of knowledge with sensitive knowledge through Locke’s category of ideas of reflection. According to this approach, all of our ideas are stamped with an idea of reflection that tells us which of our mental faculties was responsible for producing the idea in our minds at that time. When we have a sensory experience of some object, like the crimson water fountain in section one, our idea of that object agrees with the idea of actual sensation, which itself agrees with the idea of real existence.

As Locke understands the kinds of skeptical doubts in the above mentioned passages, skepticism amounts to doubting the veracity of our ideas of reflection. That is, radical skepticism amounts to doubting that when an idea of a crimson water fountain is stamped with the idea of actual sensation, the idea of the crimson water fountain really is received through sensation. Instead, the idea might be produced in the mind by the mind itself recalling or imagining the idea (unbeknownst to itself). Ideas of reflection provide us with our only way of understanding our mind, however. We have no access to our minds or their activities other than through ideas of reflection. To doubt the veracity of an idea of reflection is therefore to doubt the very possibility of even talking about activities of the mind like knowledge. In doubting whether our ideas of reflection really do tell us about the activities of the mind, then, the skeptic renders useless all talk of knowledge whatsoever.

When Locke says that it matters not whether he replies to the skeptic, Allen argues, he is pointing out this way in which the skeptic’s argument is self-undermining. The skeptic’s goal is to challenge whether we have the knowledge we take ourselves to have. In raising the doubts that they do, however, the skeptic undermines their ability to talk about knowledge at all. Without being able to talk about knowledge the skeptic renders the very doubts they raise about knowledge empty and meaningless.

The force of this response depends on the strength of the skepticism confronted. This reply is only addressed at the most radical of skeptics—the kind of skeptic who challenges that the reflective idea of sensation tells us anything at all about the means by which an idea was produced in the mind. Such a skeptic doubts even the connection between a mind and its thought. A less radical skeptic may simply suggest that on any given particular occasion in which you think you have sensitive knowledge, you do not. A moderate skeptic of this kind would simply note that the reflective idea of sensation is not infallible, it is at best reliable. There are occasions, then, when an idea of a sensible object is stamped with a reflective idea of sensation but the idea of the sensible object is not in fact produced in the mind via sensation. This more moderate worry does not threaten to completely undermine our ability to understand our own minds via ideas of reflection but it does seem to undermine any given instance of sensitive knowledge. There need be no difference in terms of the ideas in your mind when you look up the hall and see a crimson water fountain and when you hallucinate one. Generally, it’s true, the latter will not be stamped with a reflective idea of sensation but on occasion it can be. So far as is possible to tell from one’s own subjective perspective, any given instance of sensitive knowledge might be one of the mistaken cases. Thus, though we cannot be generally mistaken about the existence of the external world we can be mistaken in any particular case.

d. Themes in Locke’s Responses to Skepticism

A theme that emerges from Locke’s anti-skeptical arguments is the way in which Locke’s account of what it is to have knowledge of the external world comes apart from how skeptical worries are to be engaged. Individuals can have sensitive knowledge even if they can’t draw on the lines of argument Locke himself develops in the Essay. Indeed, in no case is skepticism refuted, or proved wrong. Rather, the skeptic is pushed back with arguments that support a probable opinion that skepticism is mistaken. Locke makes this point explicit when it comes to his ‘concurrent reasons.’ They are reasons both independent of our sensitive knowledge as well as not capable of proving the skeptic wrong. Locke’s other lines of anti-skeptical argument carry the same theme.

Locke’s point that the skeptic can’t doubt their senses in practice emphasizes that even someone committed to skeptical doubts has sensitive knowledge. The force of this response, however, rests on claims about the fundamental nature and purpose of our cognitive faculties that seem beyond the scope of knowledge. Finally, the claim that the radical skeptic is self-undermining likewise divorces the having of sensitive knowledge from anti-skeptical argument. Even the radical skeptic, on this argument, is not so much refuted through a reductio ad absurdum argument as she is set aside as incoherent or not worth serious engagement.

A second theme in Locke’s anti-skeptical argument is that his primary emphasis is merely on the externality or distinctness from our mind of sensible objects. The skeptic Locke engages in the pages of the Essay is one who suggests that what seem to be sensory experiences are in fact nothing but the product of our own mind as a kind of dreaming or mere imaginations. So, even if Locke succeeds in rebuffing the worry that the mind itself is responsible for its sensory experiences, it is not clear how far that takes him against other nearby worries.

For example, Locke’s replies to the skeptic seem to leave us well short of knowing even that there is a distinctly physical as opposed to merely external world. To appreciate this issue and the fine line Locke attempts to draw, consider three claims that Locke holds. First, we cannot know the fundamental nature of any kind of thing, even the nature of our own minds. Second, we know the existence of things distinct from our minds. Third, we know the existence of physical objects (bodies) through sensation. These three claims encapsulate Locke’s rejection of a Cartesian account of the world and our knowledge of it. On a Cartesian view, not only do we know the existence of an external world but we also know its fundamental nature. Locke accepts, nevertheless, that we know bodies to exist distinct from our own minds as thinking things. The difficulty of making sense of Locke’s view can be highlighted by considering Locke’s position vis-a-vis idealist metaphysics. It is not clear, for example, how or whether the position Locke stakes out in holding these three claims is incompatible with an idealist metaphysics—such as Berkeley’s—that gives particular physical objects an existence external to and independent of any particular finite mind. Descartes’ position, on the other hand, stands in clear contrast with the metaphysics of Berkeley. Thus, though Locke’s reply to the skeptic may carry weight against someone who denies there is an external world, it is more difficult to understand how Locke can claim that we know physical objects to exist. Satisfactorily addressing this question for Locke takes us beyond the scope of this entry and into untangling the relationships between what Locke calls nominal essences, real essences, and substance.

4. Conclusion

Locke’s discussion of knowledge of the external world brings us to confront many of the central themes in Locke’s philosophy. Locke thinks of knowledge of the external world as sensitive knowledge of real existence. That is, it is knowledge that some object exists distinct from our mind and affects our mind by producing certain ideas in it. This knowledge is achieved through sensory experience. It is neither the result of reflecting on ideas already in our mind nor of deductively reasoning from some premises.

Integrating sensitive knowledge with Locke’s broader epistemology is no easy task. Locke’s definition of knowledge appears to make all knowledge a priori, but knowledge of the external world is patently not a priori knowledge like knowledge of mathematical truths—even by Locke’s own lights. It is empirical knowledge gained through experience. Locke nevertheless insists that we have sensitive knowledge. Efforts to understand the place of sensitive knowledge in Locke’s epistemology as a whole lead to probing not only important questions about his definition of knowledge—such as whether it really does make all knowledge a priori—but also his philosophy of mind and accounts of representation and mental content. Indeed, efforts on these issues have led to very radical rethinks of Locke’s entire philosophy such as Yolton’s effort to understand Locke’s theory of perception in direct perception terms.

Finally, Locke’s account of sensitive knowledge is intimately related to but significantly distinct from his reply to skepticism. Locke does not think that particular instances of sensitive knowledge—such as when you know that the paper (or screen) you’re reading from exists—depend on being able to defeat skeptical doubts. Indeed, Locke does not seem to think that the skeptic can be fully defeated or demonstratively proved wrong. Rather, skeptical worries can be pushed back using the probable arguments embodied in Locke’s concurrent reasons with sensitive knowledge. Our passivity in sensation and the coherence of our sensation seem to call out for explanation. The best explanation, Locke seems to think even though he does not explicitly argue the point, is the existence of an external world. Locke rejects other forms of skepticism as either grounded on unacceptable assumptions or else as containing the seeds of their own incoherence. Ultimately none of Locke’s anti-skeptical arguments are likely to convince a dug-in skeptic. But in this failure, Locke is surely not alone, even among other canonical figures in the history of philosophy.

5. References and Further Reading

a. Primary Texts

Locke, John. An Essay Concerning Human Understanding (ed. Peter Nidditch). Oxford University Press, 1975.
- This is the standard scholarly edition of the Essay. It includes an editorial system to note revisions made to the Essay from the first through sixth editions of the Essay as well as references to the translation of the Essay into French by Pierre Coste.
Locke, John. The Works of John Locke (ed. Thomas Tegg), 1823.
- The collection is contained in nine volumes and includes Locke’s writings and correspondence on many topics, from philosophy, to economics, to religion. Most relevant to this entry, Locke’s correspondence with Stillingfleet is in the fourth volume.

b. Secondary Literature

Entries are organized by topic, including the context of their mention in this entry.

i. Sensitive Knowledge as Incompatible with Locke’s Definition of Knowledge

Woolhouse, Roger. ‘Locke’s theory of knowledge,’ The Cambridge Companion to Locke, ed. Vere Chappell, p. 146-171. Cambridge University Press, 1994.
- This is a very accessible entry on Locke’s epistemology. Woolhouse spends time in the entry developing the distinct incompatibility between sensitive knowledge and the definition of knowledge.
Jolley, Nicholas. Locke. Oxford University Press, 1999.
- Jolley’s book is a short, easily approachable introduction to the whole of Locke’s thought. In the book Jolley not only develops an argument that sensitive knowledge is incompatible with Locke’s theory of knowledge but the broader point that the epistemology Locke develops in Book IV of the Essay is incompatible with the empiricist philosophy of mind and language developed in the first three books of the Essay.

ii. Sensitive Knowledge and the Semantics of Ideas

Ayers, Michael. Locke: Epistemology and Ontology. Routledge, 1993.
- Ayers’ book is one of the most influential books in recent Locke scholarship and ranges over the whole of Locke’s metaphysics and epistemology. In many places Ayers attempts to draw connections between Locke’s views and current views and issues in philosophy. It is a near must-read for anyone interested in Locke’s theoretical philosophy. It also contains the formation of the blank effect view of the semantics of simple ideas and an explanation of how the blank effect view could help to make sense of Locke’s claims about sensitive knowledge.
Bolton, Martha. ‘Locke on the Semantic and Epistemic Role of Simple Ideas of Sensation,’ Pacific Philosophical Quarterly, Vol. 85, Issue 3, p. 301-321. 2004.
- Bolton’s article is a very clear development of the connection between the epistemic and semantic features of simple ideas of sensation mentioned in section 2.3. She also directly engages and discusses the ‘blank effect’ view from Ayers.
Ott, Walter. ‘What is Locke’s theory of representation?’ British Journal for the History of Philosophy, Vol. 20, Issue 6, p.1077-1095. 2012.
- Ott is a leading scholar on Locke’s theory of representation in both mind and language. This 2012 piece is a nice high-level introduction to the issues surrounding Locke’s theory of representation and spells out in detail some of the possible ways of understanding externalist content for Locke’s ideas.

iii. Sensitive Knowledge as an Agreement between Ideas

Allen, Keith. ‘Locke and Sensitive Knowledge,’ Journal of the History of Philosophy, Vol. 51, Issue 2, p.249-266. 2013.
- Allen’s article is notable not only for its clear account of the way in which sensitive knowledge can be made compatible with Locke’s definition of knowledge but also for its very in-depth discussion of how that account of knowledge supplies Locke with a powerful response to the radical skeptic.
Nagel, Jennifer. ‘Sensitive Knowledge: Locke on Skepticism and Sensation.’
- Nagel’s article has been in circulation online for some time. After developing an account of sensitive knowledge similar to Allen’s, Nagel provides an in-depth account of how Locke develops the point that the skeptic cannot doubt her senses in practice. Nagel also provides useful historical context as to why Locke would’ve thought such a response to the skeptic powerful in light of the kind of skepticism that was popular in the late 17th century.
Newman, Lex. ‘Locke on Sensitive Knowledge and the Veil of Perception—Four Misconceptions,’ Pacific Philosophical Quarterly, Vol. 85, Issue 3, 273-300. 2004.
- Newman’s article on sensitive knowledge is a careful and methodical look at how sensitive knowledge is compatible not only with Locke’s definition of knowledge but also with attributing to Locke a representational (rather than direct) theory of perception. Newman’s article also contains very detailed argument in favor of the ‘between-ideas’ formulation of Locke’s definition of knowledge mentioned above in section 2.1.
Newman, Lex. ‘Locke on Knowledge,’ The Cambridge Companion to Locke’s ‘Essay Concerning Human Understanding,’ ed. Lex Newman, 313-351. Cambridge University Press, 2007.
- Newman’s general article on knowledge is a very accessible entry point into Locke’s broader epistemology. It concludes with a shorter, more easily digested presentation of the view of sensitive knowledge developed in the 2004 paper above.

iv. Locke and Direct Perception

Yolton, John. Locke and the Compass of Human Understanding. Cambridge University Press, 1970.
- Yolton’s book contains some of the earliest and clearest attempts to develop a direct perception interpretation of the Essay. In this book Yolton also develops the interpretation of Locke’s definition of knowledge as an agreement between an idea and something else, not necessarily an idea. Putting these two points together, Yolton argues that sensitive knowledge fits neatly within Locke’s definition of knowledge.
Lennon, Thomas. ‘Through a Glass Darkly: More on Locke’s Logic of Ideas,’ Pacific Philosophical Quarterly, Vol. 85, Issue 3, p. 301-321. 2004.
Lennon, Thomas. ‘The Logic of Ideas and the Logic of Things: A Reply to Chappell,’ Pacific Philosophical Quarterly, Vol. 85, Issue 3, p. 356-360. 2004.
Lennon, Thomas. ‘Locke on Ideas and Representation,’ The Cambridge Companion to Locke’s ‘Essay Concerning Human Understanding,’ ed. Lex Newman, 231-257. Cambridge University Press, 2007.
- All three of these articles by Lennon develop in great detail both the textual and philosophical cases for a direct perception interpretation of Locke’s theory of ideas. The 2007 article from the Cambridge Companion is the most accessible of the bunch and takes on some of the most straightforward objections to the theory.

v. Sensitive Knowledge as Assurance

Rickless, Samuel. ‘Is Locke’s Theory of Knowledge Inconsistent?’ Philosophy and Phenomenological Research, Vol. 77, Issue 1, p. 83-104. 2008.
Rickless, Samuel. ‘Locke’s “Sensitive knowledge”: Knowledge or Assurance?’ Oxford Studies in Early Modern Philosophy, Vol. 7. Forthcoming.
- Rickless’ articles provide a sustained, thorough, and creative argument for the claim that Locke does not really think that sensitive knowledge is a kind of knowledge. Rickless provides both textual and philosophical motivation for his interpretation. The second, forthcoming article addresses some of the criticisms that have been made of his view by Allen, Nagel, and Owen.
Owen, David. ‘Locke on Sensitive Knowledge.’
- This is an unpublished manuscript from David Owen, a leading scholar in early modern philosophy. This article is an accessible argument against Rickless’ assurance interpretation.

vi. Locke’s Account of Knowledge as an Analysis

Mattern, Ruth. ‘Locke: “Our Knowledge, Which All Consists in Propositions”.’ Canadian Journal of Philosophy, Vol 8, 677-695. 1978. Reprinted in Locke, ed. Vere Chappell, p. 266-241. Oxford University Press, 1998.
- Mattern’s article marks an important first attempt to understand Locke’s definition as compatible with sensitive knowledge on the grounds that the definition of knowledge is just a statement that knowledge is grasping the truth of a proposition in the Essay’s terminology of ideas.
Soles, David. ‘Locke on Knowledge and Propositions,’ Philosophical Topics, Vol. 13, Issue 2, p.19-29. 1985.
Soles, David. ‘Locke’s Empiricism and the Postulation of Unobservables,’ Journal of the History of Philosophy, Vol. 23, Issue 3, p. 339-369. 1985.
- Both of Soles’ articles, but especially the first listed above, very clearly articulate the difference between offering an analysis of knowledge and defining knowledge by describing the subject matter of knowledge. Soles clearly articulates the distinction and how understanding Locke’s definition of knowledge as an analysis makes it clearly compatible with sensitive knowledge.

vii. Sensitive Knowledge and Skepticism

Weinberg, Shelley. ‘Locke’s Reply to the Skeptic,’ Pacific Philosophical Quarterly, Vol. 94, Issue 3, p.389-420. 2013.
- Weinberg’s article develops the distinct way in which sensitive knowledge is non-inferential. In light of the non-inferentiality of sensitive knowledge, Weinberg goes on to discuss the lines of response open to Locke.

viii. Further Reading

For those with further interests in the topics of Locke on perception or sensitive knowledge, it is worth reading a special issue of Pacific Philosophical Quarterly edited by Vere Chappell on the topic of Locke’s veil of perception. Several entries above are from this issue: Volume 85, Issue 3 (2004). In addition to the entries listed above, there is an introduction to the volume and commentary on each article from Vere Chappell.

Bolton, Martha, ‘The Taxonomy of Ideas in Locke’s Essay,’ The Cambridge Companion to Locke’s ‘Essay Concerning Human Understanding,’ ed. Lex Newman, 67-100. Cambridge University Press, 2007.
- An accessible general introduction to Locke’s theory of ideas. In the discussion of Locke’s account of the representational content of ideas, it was noted that Locke takes modes and relations to be mind-dependent. For more on the difference between simple ideas and ideas of substances, on the one hand, and ideas of modes and relations, on the other, it is helpful to look at Locke’s discussion of what he calls the reality, adequacy, and truth of ideas.
Stuart, Matthew. Locke’s Metaphysics. Oxford University Press, 2013.
- For those interested in more on Locke’s metaphysics, including the mind-dependence of modes and relations, a recent work with an exceptional ability to bring contemporary analytical tools to Locke’s philosophy.
Newman, Lex (editor). The Cambridge Companion to Locke’s ‘Essay Concerning Human Understanding.’ Cambridge University Press, 2007.
- A broad look at several topics in Locke’s theoretical philosophy, including several articles relevant to Locke’s discussion of nominal essence, real essence, and substance. The articles by Ed McCann, Margaret Atherton, Michael Losonsky and Lisa Downing are especially relevant to questions of the way in which Locke can make sense of the claim that there is a physical world external to and distinct from our minds.

Author Information

Matthew Priselac
Email: mdpriselac@ou.edu
University of Oklahoma
U. S. A.

Filial Obligation

The question of what one should do for one’s parents is often urgent; a parent needs care in the near future, and the grown child must decide what kind of care to provide, whether and to what extent to finance the provision of care, and to what extent the child ought to sacrifice his happiness, wellbeing, financial security, guest bedroom, and so forth for the sake of his parent. These questions are made murkier by shifting family structures, varying closeness—both past and present—between the parent and child, and conflicting obligations, such as those to one’s own children or partner or both. To make matters worse for those facing these questions, the problem is relatively new in the philosophical literature.

Despite the urgency of the problem, few philosophers have directly engaged with the question of filial obligations. Although several briefly mention this question and sketch a few initial considerations regarding it, only a handful of contemporary philosophers have attempted to articulate a theory of what one owes one’s parents. In what follows, five such theories are presented and critiqued: Debt Theory, Friendship Theory, Gratitude Theory, Special Goods Theory, and Gratitude for Special Goods Theory.

1. The Problem
2. Debt Theory
a. The View
b. Criticism
3. Friendship Theory
a. The View
b. Criticism
4. Gratitude Theory
a. The View
b. Criticism
5. Special Goods Theory
a. The View
b. Criticism
6. Gratitude for Special Goods Theory
a. The View
b. Criticism
7. References and Further Reading

1. The Problem

Though the family is a well-established institution about which much has been said, the current state of the parent-child relationship is a relatively new phenomenon. First, the family structure itself has shifted so that there now exists a wide and often confusing array of family unit types. Social roles have become difficult to determine, as have the social obligations attached to and defined by those social roles. Second, life expectancy in most first world nations is considerably longer than it has been in the past, so caring for elderly parents has only recently become a long-term commitment. Third, the care required to reach, and possibly enjoy, that longer life expectancy is often expensive. Fourth, the birth rate has declined, such that each parent has, on average, fewer children to provide the necessary care than in previous times. In short, longer-living parents have fewer children who might share the increasing financial burden of caring for the aging parent for longer periods of time, and shifting family structures obscure rather than clarify the role each family member should play.

Despite its intuitive appeal, the idea that we might have special obligations to someone in virtue of our relationship with them is a philosophically problematic one. Filial obligations are a particularly puzzling subset of special obligations. If they exist, some of their likely features seem striking. For example, filial obligations are generally owed to people with whom one’s relationship is largely non-voluntary. Yet, obligations might arise from non-voluntary situations that do not involve a parent-child relationship. For instance, if you are the person uniquely situated to save the drowning toddler in the Shallow Pond example, you have a rather stringent obligation, though the relationship that has generated it is non-voluntary. However, the parent-child relationship is not merely non-voluntary. Rather, for much of the relationship, voluntariness is asymmetrical; parents chose to enter into a parent-child relationship, whereas children (and presumably the drowning toddler, for that matter) did not make a similar choice. The current state of one’s relationship with one’s parents might be voluntary; the child might choose to engage (or not) with his or her parents, exchange benefits, and so forth. However, if one’s parents made tremendous sacrifices on one’s behalf, many of which one did not request and of which one was unaware, then one owes one’s parents something in response to benefits that one did not voluntarily seek and in many cases was not free to reject.

Furthermore, if we owe our parents something because they fed, clothed, and sheltered us, then it seems as though we have obligations in response to acts which were themselves morally required. Our parents are required to feed, clothe, and shelter us, at least for some time. Why would we owe anything in response? After paying taxes, we do not owe the state a “thank you” for the services it provides, nor does the state owe us a “thank you” for our tax dollars. How could moral obligations arise from provisions that are morally required of the benefactors and are not voluntarily sought or accepted by the beneficiary, particularly when the benefactors voluntarily chose to become obligated?

Moreover, filial obligations raise questions about distributive justice, both theoretical and practical. Theoretical questions arise because filial obligations seem to confer special advantages on individuals with children—or at least those concerned to fulfill such obligations—that childless individuals—or individuals with children less concerned to fulfill such obligations—do not similarly enjoy. Furthermore, these advantages compound an already existing advantage: the parent-child relationship is itself a benefit to both parties when all goes well.

Practical questions arise about how filial care ought to be distributed. Such care falls disproportionately on women, which seems to violate any reasonable demands of justice. For instance, if we consider equal opportunity as a guiding principle of justice, and filial care interferes with women’s access to and participation in the workforce, then filial care as it has been traditionally practiced violates this important demand of justice. Furthermore, insofar as women “perform the majority of housework chores and function as the primary parent for small children,” perhaps women are owed more; that is, filial obligations to one’s mother may be more extensive than those to one’s father (Jecker 2002).

2. Debt Theory

a. The View

According to Debt Theory, one owes repayment to one’s parent for whatever investment of resources the parent has made on one’s behalf, regardless of the parent’s needs or child’s ability, unless the parent releases the child from the debt. According to Debt Theory, as articulated here, children have specific obligations to their parents: they must repay their parents’ “investment” in child rearing. Parents contribute resources to raising children, including time, money, energy, and so on. Each of these resources could have been devoted to something other than raising a child. Consequently, the parent has fewer resources than he or she would otherwise have. The child therefore owes repayment of the debt.

Because Debt Theory views the parent-child relationship as analogous to the creditor-debtor relationship, the circumstances under which a child would not be required to repay the investment are similar to those under which a debtor might be released from his obligation of repayment to the creditor. For example, a parent might release his child from this obligation of repayment much as a creditor might release a debtor. However, as in the case of the creditor-debtor relationship, neither the parent’s needs nor the child’s ability determines the content of the child’s obligation. The child owes repayment regardless of his ability to repay the “loan” and regardless of whether the parent needs to be repaid. Just as the debtor’s obligation of repayment is not contingent on an ongoing, mutually beneficial relationship, the child owes repayment regardless of the nature of his relationship with his parent. Filial obligations arise from and are determined by the parent’s investment in rearing his or her child.

b. Criticism

As stated, Debt Theory is quite simple and, perhaps as a consequence of this simplicity, many critics think it is wildly implausible. Philosophers who advance a particular account of filial obligations often begin by rejecting Debt Theory. However, much of the critical discussion about it focuses on problems with an unarticulated and undefended position.

Although Confucius, Aristotle, and Thomas Aquinas each discuss filial obligations in the language of debt repayment, the view is not that children owe repayment of a “loan.” Rather, given what parents do for children, including bringing them into existence, children owe them gratitude and piety. Jan Narveson (1987) offers an account of filial morality that resembles Debt Theory, but on his view, the reason children ought to repay their parents is that it is beneficial for them to behave in ways that make parental investments rational and thereby encourage parents to make such investments. Similarly, despite “debt” language present in both Jeffrey Blustein’s Parents and Children and Philip J. Ivanhoe’s “Filial Piety as a Virtue,” both offer accounts of Debt Theory where the debt is one of gratitude or respect rather than a straightforward debt of repayment. Furthermore, Blustein suggests that those who use the “owing idiom” in fact “confuse gratitude with indebtedness” (Blustein 1982). Li (1997) has also added to these views.

In line with Blustein’s and Ivanhoe’s accounts, the historical models for Debt Theory are not versions of the theory articulated here. Rather, they more closely resemble Gratitude Theory. Nonetheless, Debt Theory often serves as a useful starting point in discussions about filial obligation. Jane English, who first articulated and endorsed Friendship Theory, and Simon Keller, who did the same for Special Goods Theory, both begin by offering objections to Debt Theory. This shapes the content of their preferred theories of filial obligation. They both argue that the Debt Theory of filial obligations makes a faulty analogy between the parent-child and creditor-debtor relationships, and that this analogy ignores morally relevant features of the parent-child relationship. Despite this, critics disagree on what, if any, relationship is more closely analogous to the parent-child relationship and what that analogy, or lack thereof, implies about our filial obligations.

3. Friendship Theory

a. The View

According to Friendship Theory, children ought to do for their parents what they would do for friends with whom they share a voluntary, caring relationship. These obligations depend upon the needs and abilities of both child and parent, as well as the current state of the relationship. If the parent and child do not share a voluntary, loving relationship, then the child has no filial obligations. According to Friendship Theory founder Jane English, the entire language of the Debt Theory is problematic, as children do not, strictly speaking, owe their parents anything. The parent-child relationship is unlike the creditor-debtor relationship in that it is characterized by feelings of love and voluntary friendship.

Obligations between friends do not usually arise from one friend being in debt to another. If, for example, one friend does a favor for the other, then the beneficiary of the favor owes repayment. However, English argues that although favors generate debts, favors are fundamentally different from what friends generally do for one another. Similarly, favors are fundamentally different from most of what parents do for their children.

Unlike creditors, friends are, or ought to be, motivated by care. Indeed, we would be troubled to discover that our friends were keeping track of what nice things they had done for us and what nice things we had done for them, keeping a watchful eye out for any imbalances that might arise. According to Friendship Theory, we ought to focus not solely on the cost to the parent or on the benefit to the child, but also on the relationship in which the benefit arises. Consider a case in which a parent invests in her child’s education. If the child has asked for this payment, then the investment is a favor and the child owes repayment. If, however, the parent simply wants a good education for her child and offers to pay, then the child does not incur a debt of repayment by accepting the parent’s offer. The parent’s investment is made out of care for the child, and although the child would do well to help her parent later, the child does not owe the parent anything. Furthermore, just as we would question the friend who does nice things for us with the expectation of repayment, we might question the parent who invests in her child with an eye on repayment later. Unlike the creditor-debtor relationship, in which balance sheets are expected and appropriate, friendships are relationships where balance sheets that record kind acts would indicate the relationship’s failure. In this regard, the parent-child relationship is, or ought to be, more like a friendship than like a creditor-debtor relationship.

Returning to the case of the parent who pays for her child’s elite education, the difference between Debt Theory and Friendship Theory is clear. According to Debt Theory, the child owes repayment regardless of the parent’s financial situation. Just as a debtor owes a creditor regardless of the creditor’s financial situation, the child owes repayment for the expensive education, even if the parent’s financial investment did not constitute any significant sacrifice. On Debt Theory, if multibillionaire parents pay $200,000 for their child’s education, that child owes repayment, despite her parents’ staggering wealth. To put this in English’s own terms, if these parents do a “favor” for their child by paying $200,000, then the child owes repayment of the favor. Whether the favor constitutes a sacrifice is irrelevant in determining whether obligations arise from the favor. Yet, the degree of sacrifice might be relevant in determining the content of those obligations. Similarly, according to Debt Theory, the grown child’s abilities are irrelevant. If the child is unable to repay the parent, the child has simply defaulted on her obligations.

According to Friendship Theory, however, the multibillionaires’ child ought to be kind, express concern for, and generally continue a caring relationship with her parents, but the child need not contribute to their financial resources. For example, the child ought to call or visit her parents on their birthdays, or ensure that the relationship continues in some way. However, since her parents are financially capable of providing for their own care in old age, Friendship Theory tells us that the multibillionaires’ children do not have an obligation to provide such care. On this view, the needs, abilities, and resources of both parties shape the content of filial obligations.

According to Debt Theory, these obligations do not diminish or disappear if the parent-child relationship is terminated. The obligation of repayment might become stronger since the parent no longer enjoys benefits such as participating in the relationship, which she might have enjoyed if the relationship had continued. According to Friendship Theory, however, since the source of the obligations is not the investment but rather the relationship itself, filial obligations diminish or disappear if the relationship diminishes or disappears. To maintain that filial obligations exist without a voluntary, loving parent-child relationship is to ignore morally relevant differences between what the parent does for her child and what the creditor does for the debtor. Appropriately, the creditor acts with the expectation of repayment. The parent does not, or at least should not, act with this expectation.

b. Criticism

Two types of criticism arise in response to Friendship Theory. First, critics argue that focusing on the current state of a relationship to determine whether filial obligations arise makes those obligations too easily avoidable, thereby licensing filial ingratitude. Second, as with Debt Theory, the relationship analogy fails. Just as the parent-child relationship is different from the creditor-debtor relationship in morally significant ways, it also differs from friendships in morally significant ways.

The first type of objection challenges English’s claim that filial obligations diminish or disappear entirely when the relationship dissolves. Consequently, critics argue, filial obligations are too easily avoidable, and cases of filial ingratitude appear unproblematic. In his criticism of Friendship Theory, Simon Keller (2007) presents this objection as follows: “You cannot explain your failure to look after your parents by saying, ‘Look, they’re great people, and I’ll always value the times when we were close, but over the years we’ve taken different paths. I went my way, they went theirs, it seemed like the relationship wasn’t taking us where we wanted to go . . . things just aren’t the way they were.’ You are stuck with your filial duties, in a way that you are not stuck with your duties of friendship.” Although English embraces this conclusion, and argues that repayment after the relationship ends might indicate a lack of respect for the relationship, her critics find it a compelling reason to reject the entire model.

As stated, the second objection is that the parent-child relationship is different from a friendship in morally significant ways, and thus a theory of obligations between friends cannot serve as a theory of filial obligations. According to Joseph Kupfer (1990), not only is it unlikely that parents and children can be friends, it is undesirable. Parents and children cannot be friends because they are not equals within the relationship and they lack sufficient independence from one another to become equals. This lack is not, however, a problematic feature of the parent-child relationship. Rather, it is constitutive of a healthy parent-child relationship. Thus, rather than a friendship, the parent-child relationship is, or at least begins as, a relationship between unequal partners since parents shape who the child will become.

This history of unequal autonomy effectively eliminates any possibility that equality will be restored later in the relationship; that is, inequality begets further inequality. This happens in two ways. First, the child’s self-concept is shaped by her history of unequal autonomy. Thus, the child’s self-concept is likely to include diminished autonomy with respect to the parent-child relationship. Second, the child’s history with her parent forms habits of deference and respect toward the parent. Like a friend who has less autonomy within a friendship, the grown child is less likely to make decisions, offer opinions, or resist the conclusions of the more autonomous partner in the relationship. Given this history, and the likely effect it has on the prospects for equal autonomy in the future, Kupfer concludes that parents and children cannot and should not be friends. Therefore, Friendship Theory is a poor model for filial obligations.

4. Gratitude Theory

a. The View

According to Gratitude Theory, one owes gratitude to one’s parent in response to the parent’s benevolence toward the child, so long as this gratitude serves to support rather than undermine relationships of mutual respect. Although this has not been defended as an independent theory of filial obligation, gratitude theorists suggest that it is the natural grounding of such obligation. Fred Berger (1975), for instance, develops an account of gratitude, and briefly considers how it applies to the case of grown children:

The sort of continual sacrifice and caring involved in a decent upbringing is not reciprocated to parents by a warm handshake at the legal age of independence. While the notion of gratitude to one’s parents can easily be overdone, it is clear enough that an adequate showing of gratitude to them cannot be made with mere verbal expressions … It is very hard to say just what is appropriate, and it may be that there can be no answer in the abstract … It is clear, however, that a handshake or kiss on the cheek normally will not do.

Given Berger’s own account of gratitude, as well as Claudia Card’s (1988), the Gratitude Theory provides five considerations that aim to determine when obligations of gratitude arise and what such obligations entail. The five considerations are as follows:

Gratitude is a three-part relation: X is grateful to Y for Z.
Gratitude is generally warranted in response to another’s benevolence.
Gratitude is not something a benefactor has a right to, even though the beneficiary may owe it.
The beneficiary’s debt of gratitude is a relatively informal obligation.
Obligations associated with gratitude might be impossible to fulfill, though that does not imply that the obligation is itself overly demanding.

Before extending this theory to filial obligations, let us look briefly at each component.

First, gratitude is a three-part relation. Genuine gratitude is directed toward someone, and it is for something. One might be very glad to enjoy certain benefits, even when no one is responsible for providing them. Genuine gratitude, then, requires someone to whom one can be grateful.

Second, gratitude is in response to the motivations of another person, not only to the benefits or perceived benefits that person might provide. Berger (1975) articulates this consideration as follows: “Gratitude, then, does not consist in the requital of benefits but in a response to benevolence; it is a response to a grant of benefits (or the attempt to benefit us) which was motivated by a desire to help us.” To see why one might think that obligations of gratitude arise only in response to certain motivations, consider a case in which someone undertakes some action that benefits a friend, but this person does not foresee the benefit to her friend, and might have acted differently if she had foreseen this result. The friend does not owe gratitude for accidental benefits; if the person did not intend to benefit her friend, gratitude seems inappropriate. Alternatively, if someone tries to benefit her friend but fails, the friend can be grateful for the effort, even though the effort yields no actual benefits.

Third, the benefactor has no right to gratitude, even if gratitude is owed. If a man is drowning and a passerby risks her life to save him, he certainly ought to be grateful to her. If he is not, she may rightfully feel that she has been mistreated, and third parties may rightfully judge him to be reprehensibly selfish. Even so, this Good Samaritan has no right to his gratitude such that she or third parties could require that he experience or express it or both.

Fourth, obligations of gratitude are relatively informal. Unlike debt repayment, the terms of these obligations are imprecise and flexible. In some cases, appropriate gratitude might be expressed with a “thank you,” whereas in others, gratitude might require greater sacrifice. Furthermore, obligations of gratitude may change over time if the relationship between the benefactor and beneficiary changes.

Fifth, the obligations might be ongoing and impossible to fulfill, but this does not mean that they are overly demanding. Consider again the case of a passerby risking her own safety to rescue a drowning man. A mere “thank you” might not serve as a sufficient expression of his gratitude. This is the sort of case where it seems appropriate for him to say, “I can’t ever thank you enough.” Even if this is true, it does not mean that he owes the passerby lavish gifts, constant praise, or a first-born child. It means only that his gratitude ought to be ongoing. This is not necessarily a demanding obligation, however. Gratitude might require only that he thank her and continue to behave kindly toward her. On Berger’s view, expressions of gratitude need not be proportional to the benefits bestowed, for the motivation of the benefactor, rather than the benefits themselves, grounds the obligation.

According to Gratitude Theory, the obligations one has to one’s parents are based on gratitude, and fulfilling those obligations serves as an expression of one’s gratitude. As in other moral relationships, gratitude in the parent-child relationship is not always appropriate. Parents who invest heavily in their child’s education might bestow substantial benefits on the child. However, the child may have no obligations of gratitude toward his parents if they sought to bestow such benefits exclusively for self-serving reasons. Berger cites fictional cases in which parents aim to keep their family in good social status and, driven by this aim, try to secure as many benefits as possible for their child. Here, the child may feel grateful to his parents for these benefits. Nonetheless, the child does not owe gratitude. Because gratitude is a response to benevolence, the child owes it only if the parents attempt to bestow these benefits on the child for the child’s own benefit; the gratitude is in response to the parents’ motivations rather than the benefits themselves. When benevolent motivation is absent, there is no obligation of gratitude.

Gratitude Theory does not distinguish between those benefits the child voluntarily accepts and those the child does not. Blustein (1982) explains this feature of gratitude as follows:

That we did not request those services does not itself entail that we have no duty to show gratitude for them. Indeed, since gratitude is essentially a response to benevolence, it seems that we may have a duty to show gratitude (at some point) for benefits that we did not voluntarily accept but only received, and for benefits which, at the time they were provided, were judged to be benefits by the grantor alone, and not by the recipient.

Gratitude Theory does, however, distinguish between the lack of voluntary acceptance and a preference to not be the recipient of another’s generosity. In her discussion of Gratitude Theory, Card notes that gratitude may not be owed if the benefactor has disregarded the beneficiary’s wishes and cautions against confusing generosity with benevolence. Generosity, Card (1988) explains, “can be accompanied by insensitivity to others’ wishes with regard to becoming obligated” whereas “[g]enuine benevolence is incompatible with disregarding others’ willingness to become obligated. Those who lack such regard thereby lack respect.”

Furthermore, children may owe gratitude even for those benefits the parent was morally obligated to provide, such as food and shelter. After all, the parents need not provide food and shelter from a motive of duty. Rather, the parents may meet the child’s needs precisely because the parents have concern for the child’s wellbeing and want to benefit the child.

Appropriate expressions of gratitude are those that support rather than undermine the mutual respect necessary for moral relationships, and this constraint serves as an upper limit on the demands of such an obligation: whatever an obligation of gratitude requires of us, it cannot require that we forfeit our autonomy. Autonomy, according to Berger, entails broad control over the shape of our own lives. Berger (1975) offers the following justification for such a limit: “To treat someone as a person in his own right entails granting him the right to work out the plan of his life as he sees fit.” Mutual respect requires that both the parent and the child grant each other, and themselves, the right to work out the plans of their lives. Any infringement on or expectation that one will forfeit that right undermines respect within the relationship. In such a case, the parties neither see each other nor themselves as persons in their own right but rather as means to another’s ends. Because Berger considers mutual respect a necessary condition for moral relationships between persons, neither the expectation nor the expression of gratitude should undermine that respect, as doing so would harm the relationship.

Gratitude Theory is distinct from Friendship Theory in important ways, though many of the demands might overlap. As the filial ingratitude objection suggests, obligations of gratitude can extend beyond friendship, for even after a friendship dissolves, obligations of gratitude for past benevolence may persist. The basis for obligations of friendship is the friendship itself, whereas the basis for obligations of gratitude is benevolence. Although parents may behave benevolently toward their children because of a relationship that resembles a friendship, the friendship itself does not ground obligations of gratitude.

Returning to the example of those benefits the parent is required to provide, such as food and shelter, the difference between the two theories becomes clear. At the time that parents are morally obligated to provide such things—for instance, when the child is very young—a friendship might be forming such that in later years the two will have a relationship analogous to a friendship. At the moment, however, the child might be too young for the parent-child relationship to be comparable to even a non-ideal friendship. Although a friendship might form later, where obligations of friendship would then arise, the grounds for an obligation of gratitude might already be present. Whether or not a friendship later emerges between the two, the obligation of gratitude remains. Because the grounds for the obligations are distinct, so are the theories.

b. Criticism

Simon Keller and Brynn Welch each offer criticisms of Gratitude Theory. Consider the following example: two friends help a mutual friend move into a new house, but one finds moving enjoyable while the other finds it onerous. According to Keller, the beneficiary incurs a stronger or more extensive obligation of gratitude toward the friend who finds the process onerous. Analogously, a child owes more to a parent who sacrificed a great deal than she would owe to a parent whose sacrifice was less substantial. Keller finds this an unacceptable consequence of Gratitude Theory.

Keller also objects that filial obligations, unlike obligations of gratitude, are ongoing and open-ended, and if filial obligations were grounded in gratitude, we would have no gesture that would capture that gratitude. Sending a card or flowers seems laughably insufficient as a demonstration of gratitude. Expressions of such gratitude are made more difficult by the fact that we generally think that for most acts of benevolence, a card or flowers discharges our duty of gratitude; we have shown our appreciation for the benefactor’s benevolence, and nothing more is required of us.

Welch argues that Keller’s objections to Gratitude Theory are based on a mistaken or uncharitable interpretation of the view or both, and the theory can survive those objections. However, she argues that the theory does not offer any action-guiding principles and so cannot answer the question it seeks to answer: what do I owe my parents? This is not simply the claim that the theory does not specify the content of filial obligations, but rather it does not even tell us what sort of action is required. Gratitude Theory might require no action at all but only a certain emotional experience, namely the experience of being grateful to the benefactor. If the theory offers a range of possible expressions of gratitude, we will still require some guidance as to the appropriate range of actions; that is, we still need to know whether something like a thank-you or flowers would be appropriate, or whether something like paying for expensive medical care is required. If Gratitude Theory requires a particular emotional attitude, then it must answer well-known problems associated with requiring emotional experiences.

5. Special Goods Theory

a. The View

In response to what he sees as failures of Debt, Friendship, and Gratitude theories of filial obligation, Simon Keller offers Special Goods Theory. It has three conditions:

If (1) a parent needs some special good, (2) the parent has provided or currently provides special goods to the child, and (3) the child is able to provide the special good that the parent needs, then the child ought to provide that special good to the parent.

In contrast to the previous theories, Special Goods Theory of filial obligations focuses on the benefits to the child and the needs of the parent. Specifically, this theory states that the parent-child relationship is one that makes possible certain special goods. According to Keller (2006, 2007), special goods are those that “contribute to individual welfare, meaning that they are goods that benefit an individual, or that contribute to her well-being, or her best interests” and which “the parent can receive from no one (or almost no one) but the child, or the child can receive from no one (or almost no one) but the parent”. Generic goods, on the other hand, are those that can easily arise from other sources.

In making his case for Special Goods Theory, Keller (2007) states six intuitions about filial obligations and argues that his theory best explains those intuitions. He states them as follows:

“Filial duties are ongoing and open-ended; they are not duties that can be discharged once and for all.”
“The nature and extent of your filial duties do not vary with the exact nature or quantity of parental sacrifice involved in your upbringing; you do not have lesser filial duties for having been easy to raise.”
“Filial duties are not easily avoidable; the moral relationship from which they arise is not one that you choose to enter, nor one that you can simply choose to end.”
“But [filial duties] do vary with certain changes in your ongoing relationship with your parents; if your parents unreasonably disown you, for example, then your filial duties may not be what they were.”
“The demands made by filial duty do not extend so far that meeting them impedes your ability to exercise a reasonable amount of autonomous choice over the shape of your own life; you do not have filial duties to (for example) pursue a particular career, follow a particular religion, or give more financially than you can reasonably afford.”
“Filial duties can be, in a different respect, very demanding; if you can afford to pay for your parents’ medical care, for example, then filial duty can require you to do so, even if it is very expensive.”

Keller concludes from these intuitions that the parent-child relationship and, consequently, filial obligations, are unique. Unlike proponents of Debt Theory and Friendship Theory, Keller begins with the assumption that the parent-child relationship is not analogous to any other kind of relationship. The intuitions are worth discussing further, as Keller often appeals to them as justification for his theory of filial obligation.

First, Special Goods Theory can explain why filial obligations cannot be fulfilled once and for all. Satisfying the three conditions has no theoretical limit; consequently, the obligation can be ongoing. To determine a child’s filial obligations, we consider only whether the circumstances satisfy these three conditions; we do not consider whether these three conditions have been satisfied already. Fulfilling obligations in a particular instance does not preclude the continuous satisfaction of the conditions. Any time these conditions are satisfied, filial obligations arise, and the conditions can remain satisfied for as long as the parent is alive.

Second, Special Goods Theory explains why one does not have less extensive or fewer filial obligations for having been easy to raise. The extent of one’s obligations depends entirely on the three conditions being satisfied, not on the extent to which the third condition—that the child is able to provide the special good that the parent needs—is satisfied. Provided that a parent requires some special good and has in the past provided special goods to his or her children, and that the children enjoy reciprocal relationships with their parent, then any differences between what the children owe will result from differences in their abilities to provide for the parent.

Third, Special Goods Theory explains why we cannot easily escape filial obligations. Whether the conditions for filial obligations are satisfied is, to a large extent, out of our control. We cannot alter our parents’ needs, nor can we undo the fact that they have previously provided us with special goods. Further, we cannot escape filial obligation simply by terminating the relationship. Doing so will not affect whether the conditions for filial obligations are satisfied.

Fourth, the nature of one’s relationship with one’s parents can shape the content of one’s filial obligations. Consider a case in which a mother has terminated her relationship with her child. Regardless of whether this action was justified, we can now reasonably make certain claims regarding her child’s filial obligations. For example, if the mother does not wish to speak to her child, then her child is no longer positioned to provide the good in question; that is, staying in touch with her mother. The child, however, may still have other filial obligations. Yet, these may also depend on the current nature of the relationship, for it can shape the content of the obligations.

Fifth, filial obligations are not so extensive as to impede one’s ability to exercise autonomy. The third condition of filial obligations is that one is uniquely positioned to provide certain goods. The theory does not include the further requirement that one positions oneself in order to provide such goods.

Finally, Special Goods Theory can explain why filial obligations can be demanding. The child has an obligation to provide expensive long-term care for the parent if the child can provide it and if the parent has provided special goods to the child in the past or at present. According to the conditions of this theory, the extent of the obligation depends on the extent of the need and the extent to which the child can provide the required special goods.

This theory can potentially generate a wide range of filial obligations, from virtually costless to oppressively demanding, for the three conditions could continue to be satisfied so long as the parent and child are alive. Nothing about discharging the obligation in a particular instance precludes the conditions from being satisfied again, thereby generating new obligations. Moreover, in societies that do not provide care for their ageing members, long-term care is a special good, for it is unlikely to be provided by a source outside of the relationship. In such a society, the child’s obligations might be extensive simply because of the parent’s needs. Importantly, though, this theory clearly tells us what our filial obligations are: we ought to provide our parents with the special goods they need, so long as they have provided those goods to us in the past.

b. Criticism

According to Welch, Special Goods Theory does not respond appropriately to the relationship’s moral considerations, specifically those regarding what the parent deserves. Keller says that we owe our parents special goods in the context of reciprocity. According to Welch, there are three plausible interpretations of this restriction, but none suffices to avoid her moral objection. The first interpretation is that parent-child relationships are, when things go well, reciprocal insofar as both the parent and child benefit from the relationship. At least, the child benefits during his early years and the parent in her later years. Yet, if Keller means only that the relationship is reciprocal in this minimal sense, he cannot justify his fourth intuition: filial obligations “vary with certain changes in your ongoing relationship with your parents; if your parents unreasonably disown you … then your filial duties may not be what they were.” If a reciprocal relationship requires only that the parent-child relationship is or was mutually beneficial, then so long as the parent has provided the benefits to the child in the past, the current state of the relationship is irrelevant except insofar as it affects a child’s ability to provide special goods to the parent. The current state of the relationship does not necessarily determine whether the relationship is reciprocal.

The second interpretation is that a reciprocal relationship might require ongoing reciprocity; that is, the parent and child enjoy a reciprocal relationship so long as each continues to benefit. This interpretation would justify the intuition in question, for if the relationship changes and is no longer reciprocal, then the child’s filial obligations would also change. This interpretation suggests, however, that filial obligations no longer exist once the parent cannot contribute to the relationship.

Welch argues that this is problematic by offering the following example: in a society in which care for elderly persons is the responsibility of private citizens, an elderly woman suffering from dementia requires medical care, and she has a wealthy daughter who can provide such care. Yet, because the mother is physically and mentally incapable of contributing goods to the relationship, the relationship is no longer reciprocal. The daughter is wondering whether she has an obligation to provide such care for her mother, since her mother needs the care, has provided care in the past, and the daughter can provide the care. It would seem remarkably callous of the daughter to think to herself, “I have no obligation to provide the care my mother needs because, despite her care for me in the past, she no longer contributes to a reciprocal relationship.” Welch argues that filial obligations do not disappear simply because the parent is currently unable to provide special goods to the child. Thus, although the current state of the relationship would, on this interpretation of Keller’s reciprocity, determine the daughter’s filial obligations, it would do so counter-intuitively.

According to Welch, Keller’s claim that filial obligations arise “within the context of a reciprocal relationship” cannot limit filial obligations in the way Keller suggests. Either the parent-child relationship is reciprocal so long as it is now or was mutually beneficial, or it is reciprocal only when the mutual benefits are ongoing. In the former case, the relationship’s current state would determine filial obligations only insofar as it affects what the child can provide. Here, filial obligations are theoretically unlimited, regardless of the current state of the relationship. In the latter case, filial obligations diminish or disappear if the parent is no longer able to provide special goods to the child, even if this inability is not by choice. Here, filial obligations are unreasonably limited because children have obligations to their parents only so long as the children continue to benefit.

Welch considers a third interpretation of Keller’s reciprocity limitation on filial obligations. One could say that the parent-child relationship is reciprocal so long as:

a) The parent has provided special goods in the past, and continues to do so, or;
b) The parent has not provided special goods in the past, but does so now, or;
c) The parent has provided special goods in the past but now cannot provide them because, through no fault of her own, she is unable to do so.

This attempt to rescue the “reciprocal relationship” limitation on filial obligations appears ad hoc. Why think the relationship is no longer reciprocal only because parents fail to “make a reasonable effort to play their part in the relationship”? If parents fail to play their part, though perhaps not by choice, the relationship is no longer reciprocal. Thus, although the reason the relationship is no longer reciprocal is relevant for determining what obligations a grown child has, the “reciprocal relationship” limitation within Special Goods Theory cannot explain why. According to Welch, what is missing from the theory is a clear account of which changes in a relationship affect the filial obligations a child has, and why.

Thus, Welch concludes that Special Goods Theory ignores morally relevant considerations, such as what the parent deserves, even though what the parent deserves is surely relevant to determining what one owes one’s parent. Facts about the relationship’s current state inform not only what one can do for one’s parent, but also what one’s parent deserves to have done on her behalf. A mother who suffers from dementia does not deserve less because it renders her unable to contribute to the relationship, whereas a father who has unreasonably disowned his son arguably deserves less as a result of his choice to exit the relationship.

6. Gratitude for Special Goods

a. The View

Brynn Welch introduced Gratitude for Special Goods Theory in 2012, arguing that one has obligations of gratitude to provide special goods to one’s parent so long as the following four conditions are satisfied:

The parent needs some special good.
The child can position herself to provide the good.
The parent has provided and/or currently provides special goods to the child.
Expressing gratitude by providing the special good the parent needs would not undermine the mutual respect on which moral relationships are based.

Welch argues that Debt Theory and Special Goods Theory do not respond to the right features of a case—namely, features about the parent-child relationship itself—and Gratitude Theory and Friendship Theory fail to provide sufficient guidance for discharging one’s filial obligations. Gratitude for Special Goods Theory, however, avoids both of these problems.

According to Welch, this theory is responsive to considerations of the parent’s needs, the child’s ability, and what the parent deserves. Furthermore, the theory specifies the action necessary to discharge one’s filial obligations: one ought to provide the special good that the parent needs. Relying heavily on Berger’s and Card’s considerations regarding gratitude, and Keller’s articulation of special goods, Welch offers a blended theory.

Yet, Welch modifies Keller’s condition that the child be able to meet the parent’s need. Consider a situation where a grown child’s career path does not provide the means necessary to pay for her parent’s long-term care. If, however, the child has career opportunities that would make her able to meet her parent’s needs, then she has an obligation to pursue those opportunities (provided all other conditions for filial obligations are satisfied). On Gratitude for Special Goods Theory, in order to avoid filial obligations, children must be unable both to meet current needs and to position themselves to meet those needs without undermining the mutual respect necessary for moral relationships. This way of modifying Keller’s condition prohibits us from using a narrow understanding of ability.

Furthermore, Welch argues that her theory of filial obligations is superior to Special Goods Theory because it responds appropriately to considerations of what the parent deserves. On Keller’s view, the past provision of goods and the ongoing reciprocal relationship ground the child’s obligation, whereas on Welch’s view, gratitude for the past provision of goods grounds the obligation. The difference, Welch argues, is that gratitude requires that both parties respect one another and themselves, and the provision of special goods is an appropriate expression of that respect. Thus, if the parent has provided special goods in the past but has either done so with the expectation of repayment or at some point come to treat the child as merely a means to an end, the child has no obligations of gratitude, since gratitude might undermine rather than support relationships of mutual respect. The child might experience gratitude, but that gratitude is not required and might even be inappropriate.

According to Welch, this theory has several advantages over its predecessors. First, it responds to relevant considerations, such as the parent’s need, the child’s ability, and the past and current state of the relationship. Second, it specifies what we ought to do for our parents. Third, it explains the changes in the parent-child relationship that would change a grown child’s obligations to his parents. Specifically, any changes that undermine mutual respect would diminish or eliminate filial obligations; any changes that restore mutual respect would generate or strengthen filial obligations. Finally, it responds to the demands of justice and offers a moral argument for gender equality in care provisions for ageing parents.

Where earlier theories are concerned with meeting parents’ needs, or with repayment as in the case of Debt Theory, Gratitude for Special Goods Theory is concerned with the moral relationships in which these “transactions” take place. Theories focusing only on the goods themselves cannot explain what is wrong with the striking gender imbalance in the provision of these goods. Gratitude for Special Goods Theory can do just that, according to Welch. She argues that a son who shifts the responsibility of parental care to either his sister or his wife has already failed to discharge his filial obligations. It matters not simply that the goods are provided but also who provides them. The son owes his parents gratitude for the past provision of special goods; his wife does not. Similarly, when parental care falls exclusively, or almost exclusively, on female siblings, this indicates that male siblings are failing to discharge obligations. Again, that the parents’ needs are being met does not relieve the male siblings of their obligations of gratitude. Thus, Welch argues this theory is preferable to the previous theories.

b. Criticism

As the theory is the newest in the field, it has not yet received written criticism. Yet, there are two paths such criticism would likely take. First, its second condition, that the child is able to position herself to provide the required good, ignores the likely state of epistemic uncertainty under which the child will have to make decisions. If one is to choose a career based on what one anticipates one’s parents will need, this seems to require that one have access to information about future events. Will my parents live long, healthy lives, or succumb to illness during my college career? Will I be able to rely on my siblings, my spouse, and possibly my siblings’ spouses for assistance? The answers to these questions, and many like them, will shape the content of one’s filial obligations. Yet, they are questions whose answers one cannot possibly know in advance. The condition requires only that one be able to position oneself to provide the required good. The ability to so position oneself, however, depends on facts about the good in question that are likely to be unknown at the time one makes decisions about what career to pursue, where to live, what sort of family structure to create, and so forth. Thus, Welch’s theory seems to imply that children must make choices that will accommodate the maximum range of possible parental needs. Yet, it is unclear what such a choice would be.

A related but different objection is that the theory generates obligations that are too stringent. A grown child could, for example, position herself to provide special goods to her parents by forgoing having children of her own, thus freeing up time and money for her parents that would otherwise be spent on children. Does this person have an obligation to forgo having children? Although Welch argues, following Card and Berger, that one is not required to sacrifice one’s serious interests, questions remain about what constitutes a legitimate serious interest. Is an interest in world travel serious enough that one may choose travel over providing special goods to one’s parents? If the answer is yes, then Gratitude for Special Goods Theory may face problems similar to those facing Friendship Theory: the child can simply opt out of filial obligations by cultivating other serious interests. If the answer is no, then Gratitude for Special Goods Theorists seem to be left with only two options: explain the morally relevant difference between one’s interest in child-rearing and one’s interest in world travel, or accept that one’s serious interests do not override one’s filial obligations. If Gratitude for Special Goods Theorists select the first option, they are likely to find themselves constructing a perfectionist account of interests. If they select the second, they are likely to find themselves endorsing overly demanding and counter-intuitive obligations.

7. References and Further Reading

Berger, Fred. “Gratitude,” Ethics, Vol. 85, No. 4 (1975), pp. 298-309.
- Berger offers considerations regarding the role of gratitude in moral relationships, as well as the foundation of a gratitude theory of filial obligation.
Blustein, Jeffrey. Parents and Children: The Ethics of the Family, New York: Oxford University Press (1982).
- The text includes brief descriptions of the obligations parents and children have to one another and possible grounds of those obligations.
Brighouse, Harry and Swift, Adam. “Legitimate Parental Partiality,” Philosophy and Public Affairs, Vol. 37, No. 1 (2009), pp. 43-80.
- The article contains an argument for balancing parental partiality against a concern for fair equality of opportunity. It also discusses the special goods that a parent-child relationship makes possible.
Brody, Elaine. Women in the Middle: Their Parent Care Years, 2nd edition. New York: Springer (2004).
- Brody’s sociological work contains both quantitative and qualitative data about women who care for elderly parents and young children concurrently.
Card, Claudia. “Gratitude and Obligation,” American Philosophical Quarterly, Vol. 25, No. 2 (1988), pp. 115-127.
- The article considers the role of gratitude in moral relationships with an emphasis on many problematic instances of gratitude.
Chappell, Neena L. and Penning, Margaret J. “Family Caregivers: Increasing Demands in the Context of 21st Century Globalization?” in Cambridge Handbook for Age and Ageing, Malcolm L. Johnson, ed. Cambridge: Cambridge University Press (2005), pp. 455-462.
- This chapter discusses effects of policy changes, economics, and health for caregivers.
Daniels, Norman. Am I My Parents’ Keeper? An Essay on Justice Between the Young and the Old, New York: Oxford University Press (1988).
- The text presents Daniels’ historical and philosophical reflection on intergenerational justice with respect to health care.
Dixon, Nicholas. “The Friendship Model of Filial Obligations,” Journal of Applied Philosophy, Vol. 12, No. 1 (1995), pp. 77-87.
- This article includes a defense of Friendship Theory of filial obligations against initial objections.
English, Jane. “What Do Grown Children Owe Their Parents?” in Having Children: Philosophical and Legal Reflections on Parenthood, Onora O’Neill and William Ruddick, eds. New York: Oxford University Press (1979), pp. 351-356.
- The chapter presents the Friendship Theory of filial obligation.
Fitzgerald, Patrick. “Gratitude and Justice,” Ethics, Vol. 109, No. 1 (1998), pp. 119-153.
- This article discusses how obligations of gratitude extend to politics.
Hardwig, John. “Is There A Duty to Die?” from Hastings Center Report 27, No. 2 (1997), reprinted in Ethical Issues in Modern Medicine: Contemporary Readings in Bioethics, 7th edition. Bonnie Steinbock, John D. Arras, and Alex John London, eds. New York: McGraw-Hill (2009), pp. 511-520.
- This article argues that older individuals, namely those who will experience financial and emotional hardship from continued care, might have a societal obligation to die.
Ivanhoe, Philip J. “Filial Piety as a Virtue,” in Working Virtue: Virtue Ethics and Contemporary Moral Problems, Rebecca L. Walker and Philip J. Ivanhoe, eds. Oxford: Clarendon Press (2007), pp. 297-312.
- Here, Ivanhoe articulates piety as the appropriate attitude to have toward one’s parents.
Jecker, Nancy S. “Are Filial Duties Unfounded?” American Philosophical Quarterly, Vol. 26, No. 1 (1989), pp. 73-80.
- In this article, Jecker discusses whether filial obligations can arise at all.
Jecker, Nancy S. “Taking Care of One’s Own: Justice and Family Caregiving,” Theoretical Medicine, Vol. 23 (2002), pp. 117-133.
- Here, Jecker discusses justice, gender, and filial obligations.
Keller, Simon. “Four Theories of Filial Duty,” The Philosophical Quarterly, Vol. 56, No. 223 (2006), pp. 254-274.
- This article introduces and articulates the Special Goods Theory of filial obligations.
Keller, Simon. The Limits of Loyalty, Cambridge: Cambridge University Press (2007).
- Here, Keller discusses appropriate versus inappropriate loyalty. Also, theories of filial obligation are discussed at length.
Kupfer, Joseph. “Can Parents and Children Be Friends?” American Philosophical Quarterly, Vol. 27, No. 1 (1990), pp. 15-26.
- Kupfer criticizes Jane English’s Friendship Theory, emphasizing the inequality of autonomy within the parent-child relationship and the lack of sufficient distance between the parent and child.
Narveson, Jan. “On Honoring our Parents,” Southern Journal of Philosophy, Vol. 25, No. 1 (1987), pp. 65-78.
- Here, Narveson argues that we ought to care for our parents because we want to make it rational for others to have and rear children.
Okin, Susan Moller. Justice, Gender, and the Family, United States: Basic Books (1989).
- Okin discusses the causes and effects of gender inequality.
Sommers, Christina Hoff. “Filial Morality,” The Journal of Philosophy, Vol. 83, No. 8 (1986), pp. 439-456.
- This article argues that we ought to provide care for our parents because failure to do so violates the parent’s legitimate expectations.
Welch, Brynn. “A Theory of Filial Obligation,” Social Theory and Practice, Vol. 38, No. 4 (2012), pp. 717-737.
- Welch discusses Gratitude and Special Goods Theory, and articulates and defends Gratitude for Special Goods Theory.

Author Information

Brynn F. Welch
Email: bwelch@uab.edu
University of Alabama at Birmingham
U. S. A.

The Geometrical Method

The Geometrical Method is the style of proof (also called “demonstration”) that was used in Euclid’s proofs in geometry, and that was used in philosophy in Spinoza’s proofs in his Ethics. The term appeared first in 16th century Europe when mathematics was on an upswing due to the new science of mechanics. Before that, geometry had been taught as a merely theoretical discipline without being connected to natural philosophy. In contrast, natural philosophy had been based on observation, experiment, and speculation, not at all on mathematics. Galileo, though, saw the connection; he envisioned nature as a book written in mathematical signs and thus he emphasized the study of mathematics to understand nature. His initial quest for the mathematization of nature was continued by Descartes. Descartes asked for the cultivation of a new sort of geometry that would no longer be a mere abstract enterprise but could explain the phenomena of nature.

Although the use of the Geometrical Method and of mathematization more broadly became the success story of modern sciences, it faced resistance from those who believed its use led to the disenchantment of the world and the vanishing of miracles. The opponents often accused modern philosophers of haughtiness if they applied the Geometrical Method. Galileo was blamed for claiming an equality between human knowledge and God’s knowledge, at least in geometrical things. Galileo had stated that whatever we humans could demonstrate geometrically could not be known any better by God because it was necessarily true. Moreover, the constraint of geometrical demonstrations, extended to real things in nature and society, even to human beings, opened questions about the freedom of the human will, stirring up philosophical and theological debates, lasting to some extent even into our own days.

The Geometrical Method
The Essential Significance of Definitions
Adequate Ideas and A Priori Knowledge We Share with God
The Place of Empirical Knowledge in the Geometrical Method
Geometrical Method and Logic of Containment
The Mathematization of Nature as a Challenge of Necessitarianism
Conclusion
References and Further Reading
1. Abbreviations
2. Bibliography

1. The Geometrical Method

When we think of the Geometrical Method today, we usually associate it with what we see when we open a book of Euclid, or (if we are looking for its use in philosophy) what we see in Spinoza’s Ethics. Instead of a coherent flow of text, the lines are broken up into different types of text: definitions, axioms, postulates, propositions, and demonstrations. As we all learn in school, a geometrical demonstration has to start from definitions of things, which are supposed to allow for the deduction of conclusions about properties of the defined things because these properties are already (virtually) involved in the definitions. A common example is the definition of a triangle. Here it follows necessarily from its definition (being composed of three straight lines) that its angles sum up to 180°. To be sure, this holds for all triangles in Euclidean geometry necessarily and thus with absolute certainty. Geometrical demonstrations also use axioms, being statements that everybody will admit as self-evidently true, and postulates, statements which are hypothetically claimed as long as nobody objects. Both axioms and postulates are considered permitted additions to definitions that allow for a geometrical demonstration in which it is shown how the conclusions necessarily follow from the definitions.

This way of demonstration has been known since ancient Greek mathematics, mostly through Euclid’s Elements. However, the term “Geometrical Method” only came up much later, in early modern times. Jacopo Zabarella, who wrote in the late 16th century in Padua, described this method as involving two aspects, namely the resolutive and the compositive, also known as the analytic and synthetic side of the Geometrical Method (Cassirer 1974, I, 136-44). While the analytic part is considered to be helpful for discovery and invention of new truths, the synthetic is appreciated for ensuring the certainty of the results due to a complete deduction of propositions from definitions and axioms, that is, by geometrical demonstration. It was the synthetic method that provided the compelling force of the argument, making it capable of convincing others of the correctness of a proposition. Leibniz emphasized the eminent significance of such a demonstration when he referred to Euclid. The Greek mathematician had been mocked for his cumbersome demonstration of something even children could easily see, namely that two straight lines cannot surround a space and that they can only share one point. But Leibniz praised Euclid for demonstrating this anyway because he did not make the demonstration to know it but to know it with certainty (A VI, 1, N. 125, 469).

Pascal, in a text which has come down to us as an inclusion in the Port-Royal Logic of Antoine Arnauld and Pierre Nicole, provided a comprehensive description of the synthetic aspect of the Geometrical Method, which he broke down into two major demands: not to employ any term in a demonstration that had not yet been defined and not to accept any proposition which had not yet been demonstrated from defined terms or demonstrated propositions (Pascal 2000, 155-6). While only Spinoza, notoriously, explicitly uses the synthetic method in his major work Ethics, the other rationalist authors follow this very method when presenting their arguments. They begin with definitions and deduce their entire argument from them (for example, A IV, 1, N. 1). It should be noted that all rationalists were advanced mathematicians, although only Descartes and Leibniz were mathematical geniuses.

The strict demands of the Geometrical Method opened the space for discussion not only about the status of axioms not being demonstrated but also about that of definitions. Did definitions depend on human choice of words or did they have to express the essence of the defined thing? If the latter, how could we know the essence of a thing, and if the former, how would arbitrary definitions lead to truth? While the constraint of geometrical demonstrations, that is, their convincing force, could never be questioned once the definitions were admitted, it was the concept of definition and to a smaller extent that of axiom which moved to the center of the discussion about the Geometrical Method.

While providing absolute certainty, the synthetic aspect of the Geometrical Method also had disadvantages. Due to the rules not to employ any concept before defining it and not to use any proposition before demonstrating it, the way of presenting an argument had to follow the course in which these definitions and propositions could be demonstrated, which often interrupted the natural course of the argument. Also, the apparatus of definitions, axioms, postulates, propositions, and their demonstrations was quite cumbersome. Finally, the striving for unequivocal expressions did not allow for metaphors, ironies, or jokes and thus lacked entertaining qualities.

It is very common to associate the synthetic or compositive aspect with the Geometrical Method and to neglect the analytical side. However, scientists and mathematicians have always been more interested in the analytic aspect of the Geometrical Method because they aimed to discover new truths. In using the analytic or resolutive method, they did not even care much about a gapless deduction (Breger 2008, 191-2). Rather, they trusted their intuition, based on their intensive prior studies and deep knowledge about their subject. Philosophers also used the analytic side of the Geometrical Method, and Descartes even preferred it in his writing, stating that he wanted to write following the path in which he found the truth rather than presenting it by geometric demonstration (AT VII, 211-3; CSM II, 110-1). Spinoza, according to Tschirnhaus’ reports to Christian Wolff (Wolff 1980, 124-7; Corr 1972, 323-34), developed the analytic method in order to find and constantly improve definitions, using experiments and observation. He started with mere nominal definitions for things insufficiently known and replaced them (or parts of them) by causal definitions in the course of progress in his ability to produce the effects (Goldenbaum 2011, 29-41). Tschirnhaus, who was above all a mathematician and an engineer (he invented, for example, Meissen porcelain), further developed this method of defining and redefining objects of natural science based on empirical research. Christian Wolff used this method systematically to reduce the gap between a priori knowledge and experiential knowledge. When, for example, he wrote about methods to increase the growth of grain, he distinguished between facts we know from experience and the causes of some phenomena we know with certainty and have thus under control (Wolff 1734; Goldenbaum 2011). Although we cannot know the essence of the plants yet, we can come to know some causal processes of the growing of plants and thus can even predict the outcome with a high degree of certainty.

It was the goal of the analytic part of the Geometrical Method to improve the definitions of real things—not only of geometrical figures. Due to the negligence of the analytic aspect of the Geometrical Method, the understanding of definitions in the framework of the modern Geometrical Method is often insufficient.

2. The Essential Significance of Definitions

Although it was the geometrical demonstrations that guaranteed necessary truths, they were hardly under attack. Instead, it was the definitions and, to a smaller extent, the axioms that moved to the center of the philosophical discussion because they were the starting point of the Geometrical Method—in particular, of its synthetic part. Surprisingly, partisans and critics agreed about the essential significance of the definitions.

Of course, axioms also became a subject of criticism by the opponents of the Geometrical Method because, traditionally, they were not demonstrated but assumed to be evident. Critics argued that a demonstration built upon undemonstrated axioms could not guarantee the truth of the demonstrated proposition. Hobbes rose to this challenge, arguing that all axioms could actually be demonstrated as soon as anybody doubted them. Spinoza and Leibniz agreed. As an example, Hobbes and Leibniz demonstrated the axiom that had become disputed at the time, namely that the part is smaller than the whole (OL I, 105-6; De corpore II, 8, sec. 25; Leibniz A II, 1 2006, 281; A VI, 2, 480). As a result, Hobbes (OL I, 252-8; De corpore III, 20, sec. 6) and, following him, Leibniz (Leibniz, A VI, 1, N. 12; A II, 1 2006, N. 24, 153) conceived geometrical demonstrations as mere chains of definitions (axioms or postulates being capable of demonstration, if doubted). According to Leibniz, the only true axioms were identical propositions that could not be demonstrated.

But it was the concept of definition which bore the brunt of the attacks throughout the 17th and 18th centuries. Critics insisted that an extension of the Geometrical Method to real things would be impossible because we could not give any real definition of any real thing, in sharp contrast to real definitions of geometrical subjects, which we could provide. Since geometrical figures were created by humans, we could know their essence. Because real things were created by God, or at least not by human beings, their essences remained unknown to us, due to our finite minds and moreover to our fall. The same criticism can still be found in Locke and Kant.

Traditionally, there existed a general distinction between nominal and real definitions going back to Aristotle’s Organon (Anal. Post. II, 7-10). Even the new Cartesian Port-Royal Logic (L’art de penser), written by Arnauld and Nicole, kept this traditional distinction (Arnauld/Nicole 2011, 325-31; Logique de Port-Royal I, 12). While a nominal definition was nothing but words by which we named things, either by convention or by custom, without knowing the essence of the thing, a real definition would allow us to know whether the defined thing was real or at least possible in reality. Real definitions were usually supposed to be possible in mathematics, due to their human production, but also in theology, at least for the notion of God, although the latter was increasingly doubted. Pascal, for example, after his religious turn, did not accept any but nominal definitions because human beings were unable to know any real definitions (Pascal 2000, 156). In his view, we could define things as we liked, arbitrarily, and therefore there could never be a cause for serious contradiction but in mere words. This radical position, rejecting any role of reason for religion, was strongly contradicted by Arnauld and Nicole, who defended the real definitions in their Port-Royal Logic.

Here again, Hobbes took on the challenge and developed a new approach to real definitions. The way he does this sheds considerable light on how the new Geometrical Method of early modern times was indeed new, namely, shaped by the new science of mechanics. Hobbes connected the issue of definitions with Galileo’s mechanics (Jesseph 1999, 117-25). Considering geometrical figures as produced by mechanical motion (as the mathematician Roberval had already done), he understood them as effects caused by mechanical motion. A definition which included the cause of the thing to be defined showed at the same time that it was possible. In this way, it provided the opportunity to deduce any possible property of the thing, that is, even those properties we are not yet aware of. A circle, for example, is produced by the mechanical motion of one endpoint of a straight line around the other endpoint. All the possible properties of a circle can be deduced from this causal definition, necessarily.

But Hobbes then introduced this new mechanical approach to definitions into philosophy and demanded such causal definitions (or genetic definitions) in philosophy too, in order to produce necessary conclusions about reality. Indeed, he uses the term “philosophy” (or “science”) exclusively for causal explanations of phenomena, starting from causal definitions (OL I, 62-65; De corpore I, 6,6). According to Hobbes, just as within geometry, a definition that includes the mechanical cause of the thing to be defined can serve in any field of science to deduce all the properties of the thing (OL I, 71-3; De corpore I, 6, 13). Hobbes thus transforms the Geometrical Method into a general epistemological principle: what we can generate, that is, cause, we can know with certainty, in its essence, or—with necessity. That is the reason why he can claim that we can even come to know the political state by philosophy, that is, in a scientific way, namely through causal explanation—because it is produced, generated, or caused by human beings.

Hobbes’ innovation of causal definitions was adopted (together with the Geometrical Method) by Spinoza (Spinoza 1985, 31-2), by Leibniz, and by Christian Wolff (Cassirer 1974, II, 521-5; Goldenbaum 2011). Leibniz discusses the traditional distinction of nominal and real definitions as still taught in the Port-Royal Logic. According to his explanation, nominal definitions result from our clear and distinct perception of things and their properties which we can name. Such nominal definitions allow us to distinguish these clearly and distinctly perceived things from other things. Confused ideas, though, where we cannot give single properties although we somehow recognize a thing in its entirety, do not allow yet for any definition. They may be made more (and more) distinct by analysis though, that is, by further distinguishing their parts (On Synthesis and Analysis, Loemker 229-34; A VI, 4, N. 129).

In contrast to such nominal definitions, being a name for a mere listing of properties, Leibniz defines real definitions as including and displaying the possibility of the defined thing, that is, freedom from contradiction (Loemker 231; A VI, 4, N. 129, 542). His example is the definition of a circle, that is, Euclid’s definition of a circle as produced by the motion of a straight line in a plane around one of its endpoints. This definition, being clearly a causal definition (christened so by Hobbes), is for Leibniz a real definition in an exemplary way because it displays the demanded possibility of its subject. But Leibniz does not even mention any other type of real definitions (Loemker 230-1; A VI, 4, N. 129, 541).

Moreover, Hobbes, Spinoza, and Leibniz all extended the scope of causal definitions further, arguing that not only those definitions that include the actual cause of a thing but any definition that includes a cause capable of bringing about the thing to be defined can serve as its causal definition. If we can generate a thing, it is at the same time shown that it is possible. Surprisingly, Leibniz uses this extended concept of the causal definition to develop his modern concepts of hypothesis and of truth. He writes, “to set up a hypothesis or to explain the method of production is merely to demonstrate the possibility of the thing” (Loemker 231). That is, for Leibniz, a hypothesis that can explain a possible generation or causation of a thing shows its possibility and is capable of deducing all the properties of the subject of the hypothesis even if it will never come into reality.

All rationalists using the Geometrical Method intended to use it beyond geometry, making the generability of a thing through human beings the new approach to science which also changed the approach to empirical investigation. They all had a strong awareness that knowledge starting from causal definitions could provide necessary knowledge, that is, a priori knowledge about things beyond geometry. It is seldom noticed that exactly this position is already held by Galileo: “all these properties [of things in nature] are in effect virtually included in the definitions of all things; and ultimately, through being infinite, are perhaps but one in their essence and in the Divine mind” (Galilei 1967, 104).

3. Adequate Ideas and A Priori Knowledge We Share with God

The mathematician and rationalist Descartes did not yet talk of causal definitions. But in his reply to Arnauld about his fourth meditation (AT VII, 220; CSM, II, 155), he describes something he calls an “adequate idea,” which is precisely what is described as a causal definition by Hobbes. Just like causal definitions, adequate ideas have the capacity to virtually include all properties that belong to the cognized/defined thing. The term “adequate ideas” is more familiar to us from Spinoza and Leibniz. Indeed, Descartes uses it rarely and only with the greatest caution: he does not ascribe adequate ideas to human beings but to God exclusively. According to Descartes, only God, knowing everything, can be sure whether an idea indeed contains all the properties of the thing. In contrast, while human beings may know all properties of a thing, they can only be sure of the completeness of that knowledge by a special revelation of God.

Moreover, not only could God have created things in a different way, even mathematics could have been shaped differently if God had willed so (AT I, 145, 149-50; CSMK III, 23-4). While this statement caused headaches and criticism among rationalists such as Spinoza and Leibniz, they nevertheless understood Descartes as a partisan of the Geometrical Method. They admired his insistence on intuition and deduction as the only way to certainty in knowledge, that is, to a priori knowledge. And indeed, while denying that adequate ideas are within our reach, Descartes does introduce the notion of a complete idea being available to human beings. And such complete notions would contain virtually all the properties of the ideatum, making it look like an adequate idea, with the only restriction that only God could know if it was indeed complete in respect to all consequences.

Descartes’ cautious distinction between adequate and complete ideas will not be upheld by his followers. For Spinoza, it is precisely our adequate ideas, which we share with God’s intellect, that allow for certainty of our knowledge (Spinoza 1985, 474-8; EII, p.37-p.40s2) as well as for overcoming our lack of freedom. Adequate ideas will even make our mind eternal (Spinoza 1985, 613-7; EV, p.38-42s). Spinoza defines “adequate idea” as “an idea which, insofar as it is considered in itself, without relation to an object, has all the properties, or intrinsic denominations of a true idea” (Spinoza 1985, 447; EII, d4). Thus, he explicitly denies correspondence of an idea with an external object as a criterion for adequacy and thereby denies the traditional understanding of adequacy in Aristotelian scholastics as agreement or correspondence of idea and ideatum. For Spinoza, to have an adequate idea is to provide the proximate cause of the thing to be known or to define a thing by its cause. That is, he introduces the adequate idea as causal definition or deduction from causal definitions.

Even Leibniz, the committed Christian philosopher, accepted a human capability for adequate ideas. He agreed that if we know things adequately, we know them with the same certainty by which they are known by God. Such an adequate idea is given whenever the thing can be completely analyzed into its simple primitive concepts, which is precisely the case in geometrical causal definitions. Leibniz praised adequate ideas for their special capacity that from them “all truths [can be demonstrated] with the exception of identical propositions, which by their very nature are evidently indemonstrable and can truly be called axioms” (Loemker 231; A VI, 4, N. 129, 542). Just like Spinoza, Leibniz connects adequate ideas with causal definitions because such definitions, in contrast to nominal definitions, immediately display the possibility of the defined thing, without any experiment or observation: “Obviously, we cannot build a secure demonstration on any concept unless we know that this concept is possible … This is an a priori reason why possibility is a requisite in a real definition” (ibid.).

It is precisely from this Geometrical Method that Leibniz arrives at his containment logic, stating that a reason can be given for each truth “for the connection of the predicate with the subject is either evident in itself as in identities, or can be explained by an analysis of the terms. This is the only, and the highest, criterion of truth in abstract things, that is, things which do not depend on experience—that it must either be an identity or be reducible to identities” (Loemker 232; A VI, 4, N. 129, 543). From here, Leibniz states that the elements of eternal truths can be deduced and a method provided for everything if they are only cognized as demonstratively as in geometry. Of course, God cognizes everything in this way, even concrete things, that is, a priori and “sub specie aeternitatis”—because He does not need any experience. While He knows everything adequately and intuitively, we can grasp hardly anything in this way and have to rely for most things on experience.

It is interesting that in Wolffianism, when it comes to German translations, the term “idea adaequata” is bluntly translated as “complete idea” [“vollständiger Begriff”] (Spinoza 1744), thereby ignoring Descartes’ cautious distinction between complete ideas available to human beings and adequate ideas available to God. However, while all rationalists agree that human beings can know a certain number of necessary demonstrations and to that extent have adequate ideas, that is, a priori knowledge equaling divine knowledge (the latter claim not being shared by Hobbes), this view is moderated by their awareness that such a priori knowledge is extremely limited in human beings and has therefore to be supplemented by experience. Galileo, Descartes, Spinoza, and Leibniz all admit a difference between divine and human knowledge, a difference consisting in God’s thoroughgoing intuitive knowledge of all things in contrast to human discursive knowledge of very few things. Still, a few intuitive insights were available to human beings too (Galilei 1967, 103-4; see also AT X, 409 (Reg. XI)). However, they stressed the special character of this kind of knowledge which we shared with God, its absolute certainty due to its a priori character.

Adequate ideas are thus, from Descartes via Spinoza to Leibniz, ideas which provide a complete and absolutely certain knowledge of all the properties of their subject, independent of any knowledge of correspondence, that is, of sense perceptions. Although we can only reach a small number of adequate ideas, this kind of knowledge is absolutely certain, a priori, that is, necessary and thus equal with divine knowledge. Causal definitions as the central part of the new Geometrical Method were crucial to obtaining such adequate ideas. It is this kind of knowledge which distinguishes us from animals. According to Hobbes, Spinoza, and Leibniz, it goes without saying that animals could think. But they could only think in an empirical way, by observation, trial and error, or by induction. They absolutely lacked necessary or a priori knowledge, which we humans alone shared with God. Only human beings had the capability to have adequate ideas, a priori knowledge they shared with God.

4. The Place of Empirical Knowledge in the Geometrical Method

While God knows everything adequately and intuitively, we humans rarely get to know adequate ideas intuitively. Therefore, all rationalists agreed that in acquiring knowledge we usually need to rely, not just on intuition, but also on empirical knowledge. It is a widespread prejudice, due to German Idealism, that rationalists were not interested in empirical studies [see Continental Rationalism], but Descartes and Spinoza themselves performed experiments, and they all were highly interested in the scientific experiments of their time. They took, however, a very different approach to empirical studies than did the empiricists.

Although we are able to know only a few things with absolute certainty, what we are able to know in that way provides us with a fixed framework to order and interpret empirical data. Because “the fixed and eternal things” (Spinoza 1985, 41) that we know a priori are closely connected to the particular concrete things of which only God has adequate ideas, the necessary knowledge we have will help us to order our empirical data. Because these eternal abstract truths can never contradict any predicate of a complete notion or adequate idea of a concrete thing, they can provide a strong framework for our empirical work, which is available to our finite minds. When we come to learn about new facts by experience and by history, we can expect these single facts to fit into the theoretical framework like the pieces of an unfinished puzzle, and to build up an ever more complete notion of an individual and its action. Therefore, Descartes, Spinoza, and Leibniz strongly recommend the development of empirical sciences that combine a priori knowledge with experiment in mixed sciences, which were supposed to enrich human knowledge.

Of course, this process of learning can never be conclusive because it is infinite due to the infinite properties of concrete particular things, or individuals. Nevertheless, our expectation that things in the world are coherent (based on the conviction of a theoretical framework that is adequately known by God, a priori, and thus must exist), together with the available specific notions of abstract things we as human beings can reach a priori, provide powerful tools. It is as if we had an unfinished map, a compass, and a watch that, with our general framework of terrestrial geography, can guide an expedition into an unknown area. Such equipment can help us to recognize coherence and causal interconnectedness in the otherwise confusingly rich abundance of single facts of empirically obtained knowledge. Therefore, Leibniz’s, Spinoza’s, and Hobbes’ approach to empirical research is completely different from any empiricist approach to nature or history. Empiricists claim to collect facts in order to check for common patterns or similarities and then to abstract rules or laws from them. If appropriate, mathematics could be applied to these abstractions. But no cognition reached by such a process could ever provide certainty, and it must remain provisional due to the general weakness of fallen man.

This distinction between rationalists and empiricists in respect to empirical studies becomes plain in Spinoza’s criticism of Boyle, who saw his experiments as demonstrating mechanical corpuscular philosophy. Instead, Spinoza argued Boyle’s experiments would fit a hypothesis which he had held before and which had to be justified by its inner coherence alone while it could not be proved by any experiment (Spinoza 1985, 173-88, esp. 178). Leibniz quotes this statement with agreement in his argument with Locke (A VI, 6, N. 2, 454-5; Leibniz 1996, 455; IV, 12, 13). Curley contradicts the view that Spinoza ignored empirical research (Curley 1986a, 156), and indeed, Spinoza even demands a theory of experimentation (Spinoza 1985, 42). We also have evidence from his correspondence that Spinoza experimented himself.

But it is especially Leibniz’s modern concept of hypothesis that can explain the empirical project of rationalism. To recall, to state a hypothesis is to state the way of generation whereby the possibility of a thing can be proved. For Leibniz, this is even valid if parts of such a hypothesis cannot yet be perceived distinctly and can only be supposed, that is, if the hypothesis is a hybrid of causal definition and empirical facts. While such a hypothesis is valid only by presumption of the truth of our empirical knowledge, it has to be coherent in itself and can count as demonstrated to the extent it fulfills this criterion. If there exist competing hypotheses for the explanation of natural phenomena, as in the case of the hypotheses of Ptolemy, Tycho Brahe, and Copernicus, one has to choose the most intelligible hypothesis as true or closest to truth, which also coheres the most with all known phenomena.

What is already implicit in the 1680s becomes plain in the 1690s: that truth for Leibniz is nothing but the intelligibility of a hypothesis, that is, a complex causal definition. Truth is nothing we can state by checking the correspondence of our ideas with reality, as claimed by empiricists. Such a check is indeed impossible. Instead, adequate ideas are true in themselves, and their truth can be determined solely by their being free of contradiction. This alone makes them intelligible and thus possible. That is valid not only for mathematics, but also for causal definitions or hypotheses about real things. Just as Hobbes had declared, Leibniz argues: All we can generate or cause, or of which we can provide a possible way of generation or causation, is intelligible and knowable by human beings in adequate ideas.

While today we are used to distinguishing between natural sciences as hard-core science (such as physics, chemistry, biology, or, increasingly, medicine), on the one hand, and humanities and social sciences, on the other, Hobbes, Spinoza, and Leibniz instead distinguished demonstrative from empirical knowledge. To the extent that empirical knowledge could be organized in explanatory and coherent hypotheses explaining natural phenomena by mathematical science, it could be turned into a gradually demonstrative science. For Leibniz, even human history and the humanities could be turned into sciences in this way, being not really different from natural sciences in their searching for a coherent explanation of empirical, contingent truths. As soon as they could come up with a theoretical framework of a priori eternal truths available to us through the Geometrical Method, they could become science.

5. Geometrical Method and Logic of Containment

Leibniz, embracing the Geometrical Method, was fully aware of his dangerous intellectual neighbors (Hobbes and Spinoza), and worked hard to secure his metaphysics against strict determinism or necessitarianism in order to distinguish his metaphysical and epistemological project from these bad bedfellows. He had been working on this since he studied Hobbes and Spinoza in Mainz between 1670 and 1672. The result is his well-known distinction of necessitating versus inclining in paragraph 13 of the Discourse on Metaphysics written in 1686 (Loemker 310-1; A VI, 4, N. 306, p. 1546). But, notwithstanding his obvious rejection of Hobbes’ and Spinoza’s strict determinism, Leibniz clearly shares the new Geometrical Method, as a philosophical method, with the infamous philosophers, the method which was constantly accused of necessitarianism if extended to real things. Moreover, it is this new method based on the causal definition that provides the basis of Leibniz’s logic of containment (Di Bella 2005, 80-95).

Leibniz approaches the challenge by distinguishing abstract and concrete things as subjects of our ideas. While only God can have a priori knowledge of the complete notions of concrete things or individuals, we can at least have a priori knowledge of abstracta as, for example, geometrical figures because they are finite in their properties. Also, what is true for one kind of abstracta, as for example a triangle, is true of all members of that kind, for example, for all triangles. In contrast, because concrete things or individuals have infinitely many properties and are the only member of their kind, we as finite beings cannot reach their complete concepts and have to rely on empirical knowledge too when it comes to individuals (Loemker 331-8; A II, 2, N. 14). This distinction, closely related to the distinction of necessary and contingent truths, allowed Leibniz to distinguish human and divine knowledge by a qualitative criterion. Moreover, it also provided a criterion to distinguish contingent from necessary knowledge, thereby paving the path for human and divine freedom. This solution gave Leibniz sufficient confidence to present at least the headings of his Discourse on Metaphysics to the Jansenist theologian and Cartesian Arnauld in 1686, with the long sec. 13 being especially provocative in respect to free will. Clearly, at this time, Leibniz had worked out his new metaphysics (based, however, on the problematic new Geometrical Method), which would make modern science compatible with Christian dogmatics and especially allow for free will by a softened determinism.

However, in spite of Leibniz’s strong emphasis on the different ontological status of abstracta versus concreta and of necessary versus contingent truths to secure contingency and to block strict determinism, he always maintained the containment theory based on Geometrical Method. According to this view, in every true proposition, the predicate had to be included in the subject. This position clearly retains a general similarity between the two kinds of concepts because both—specific (or full) concepts of finite abstract things as much as complete concepts of concrete infinite individuals—must include all their predicates and can be known a priori by Him who generated them. This view is precisely the core of the Geometrical Method! According to Leibniz, even if human beings cannot know individuals a priori but only through empirical study or by history, God does know the complete concepts of individual substances a priori which thus exist, the subject containing the predicate.

It was this theory that would lead to paragraph 13 of the Discourse on Metaphysics, according to which the complete concept of any individual was known by God and would include every single event that would ever happen to us. When God created this world, He chose those individuals who belonged to the best of all compossible worlds. Because of that choice, led by God’s intellect, there cannot be any contradiction among the things of one world, or rather of their concepts. What is crucial here is that Leibniz’s approach to contingent things assures us, from the very beginning, of the inner coherence of all phenomena of this world that will ever occur to our experience even if we cannot see it yet. Because there is nothing arbitrary in God’s creation (nihil sine ratione), we can take it for granted that there is a universal coherence of the world in spite of our own limited approach. It is within this view that Leibniz sharply deviates from Luther and the Protestant way of thinking in which such an intelligibility of the world to humans is bluntly denied, due to the fall. It is this view that makes him a true optimist, being convinced of the intelligibility of the world, even if we will never exhaust it.

6. The Mathematization of Nature as a Challenge of Necessitarianism

The use of the Geometrical Method in philosophy had often been criticized, long before Kant argued against it (Kant 1998, 630-43; 1st Cr. A713/B741-A738/B766). One objection was that the Geometrical Method should be restricted to geometry and could not be used in any other field. At first glance, this seems quite convincing. Given the cumbersome outlook of a text written in Geometrical Method, as for example Spinoza’s Ethics, it seems obvious that this method makes understanding of the argument rather more difficult. The complicated system of references to earlier demonstrations constantly interrupts the argument; Spinoza’s addition of so many scholia wherein he explains the context and the aim of his demonstrations in common language displays his awareness of this problem.

But the objections against the Geometrical Method were more fundamental. What the partisans of the Geometrical Method saw as its greatest advantage over any other knowledge, namely its necessary conclusions and thus its certainty, was considered the greatest danger by its critics. One of the reasons for such protests was obviously the theological concern about human haughtiness, as already expressed in the accusation against Galileo. He was blamed for claiming an equality between human knowledge and that of God, at least in geometrical things (Galilei 1907, vol. 19, 326-7). Indeed, Galileo stated that what we could demonstrate geometrically could not be known any better by God because it was necessarily true: “I say that as to the truth of the knowledge which is given by mathematical proofs, this is the same that Divine wisdom recognizes” (Galilei 1967, 103; my emphasis-UG). The concern about human haughtiness was not restricted to the Catholic Church; it would also cause worries among Protestants, for example, among the Cambridge Platonists, who were very influential on Locke and Newton, both of whom rejected the Geometrical Method. In Germany, it became one of the major arguments of the Lutheran theologians and philosophers against Leibniz and Christian Wolff (Goldenbaum 2004, 48-58; 195-208).

But it was not the traditional method of Euclidean geometry that caused the massive criticism of the new Geometrical Method. Rather it was its close connection to the mathematization of nature and thereby the extension of geometry from a small discipline without practical relevance to reality into the science of the world. Galileo had opened the new path of modern science by using the Geometrical Method for the investigation of physical phenomena, and he was deeply convinced that nature itself is structured mathematically. In this way, he found the law of falling bodies as well as the parabola as the trajectory of thrown bodies; neither of them could have been found by mere observation or experiment. Galileo’s enthusiasm that mathematics would allow us to understand the inner structures of nature is most clearly expressed in his famous saying:

Philosophy is written in that great book which ever is before our eyes—I mean the universe—but we cannot understand it if we do not first learn the language and grasp the symbols in which it is written. The book is written in mathematical language, and the symbols are triangles, circles and other geometrical figures, without whose help it is impossible to comprehend a single word of it; without which one wanders in vain through a dark labyrinth. (Galilei 1960, 183-4)

Descartes followed Galileo and asked for the cultivation of a new sort of geometry that would no longer be a mere abstract enterprise but could explain the phenomena of nature (AT II, 268; CSMK III, 118-9). There was to be only one science, mathesis universalis, by which the observed natural phenomena could be explained from their inner essences and thus necessarily. The great admirer of Galileo, Thomas Hobbes, extended the Geometrical Method to politics, claiming that his political philosophy was the beginning of political science. Spinoza even extended the Geometrical Method to ethics and delivered a theory of human affects showing the necessity by which they would occur whenever certain circumstances came together. That is how he could state: “Therefore, I shall treat the nature and powers of the Affects, and the power of the Mind over them, by the same Method by which, in the preceding parts, I treated God and the Mind, and I shall consider human actions and appetites just as if it were a Question of lines, planes, and bodies” (C 492; Preface to EIII).

All these thinkers extended the Geometrical Method beyond mathematics, claiming its value for the investigation of realia, of real things instead of mere geometrical figures. Such extension of the Geometrical Method to real things was done with the goal to produce certainty of knowledge, a certainty guaranteed by the necessity of geometrical demonstrations. But if it would indeed lead to necessary demonstrations about nature, politics, and ethics, it would introduce necessitarianism into natural, social, and moral sciences, and space would not be left for miracles and, even worse, for free will. This can be seen in the cases of Hobbes and Spinoza, who both were strict determinists. In contrast, it was precisely the recognition of this threat of determinism or necessitarianism implied in the Geometrical Method that led Henry More very early to his criticism of Descartes and since the 1660s to his massive rejection of Cartesianism (More 1711, 58). Besides the theological concern about human haughtiness, it was the threat of necessitarianism that was the true source of the lasting protest against the Geometrical Method throughout the 17th and 18th centuries.

What caused the most trouble about the Geometrical Method in the 17th and throughout the 18th centuries was neither its ponderous way of thinking nor its lack of success. Rather it was the turmoil about human haughtiness and the threat that its determinism would destroy the free will of God as well as that of human beings. The correspondence of Leibniz and Clarke remains exemplary for the different approaches to God’s free will. According to Leibniz, nothing can happen without a sufficient reason, and this just proves the existence of a God who, in His perfection, could not have chosen an arbitrarily functioning world. Clarke (and Newton), on the other hand, counts any act of an arbitrary will on God’s part as a sufficient reason (Leibniz 2000, 7 and 11).

7. Conclusion

Two things caused deep anxiety and anger regarding this method: (1) the attempt to extend the Geometrical Method to nature, to humans, and to society (taking mathematization of nature for granted), providing human beings with a godlike a priori knowledge beyond mathematics, even if limited; and (2) the threat of determinism. These threats forced theologians and Christian philosophers to reject rationalism and the Geometrical Method altogether. In sharp contrast to rationalism, Locke would even deny the possibility of any natural science because we could not have any real definitions beyond mathematics and morals:

This way of getting and improving our Knowledge in Substances only by Experience and History, which is all the weakness of our Faculties in this State of Mediocrity, which we are in this World, can attain to, makes me suspect, that natural Philosophy is not capable of being made a Science. We are able, I imagine, to reach very little general Knowledge concerning the Species of Bodies, and their several Properties. (Locke 1975, 645; Leibniz 1996, 453; IV, 12, 10)

Kant would declare that there would “never be a Newton for a blade of grass” (Kant 2000, 268-71; 3rd Crit., 75 (B338)), pointing us instead to design theory in biology and admitting causal explanations only for mathematics and mechanics, that is, applied mathematics. Both approaches were applauded by Protestant theology (Goldenbaum 2004, 48-58).

Thus, the opposition between the two philosophical camps of rationalism and empiricism was not the result of different approaches to experience as is often claimed. Rather, it was their different and opposing stances toward the Geometrical Method and the mathematization of nature. This new Geometrical Method was in no way a merely external way of presentation to rationalist philosophy. Instead, it constituted this philosophy. As much as rationalist philosophers differ in their philosophical systems, they all agree that human beings can arrive at a priori knowledge (through deducing from definitions), independently of experience, and that this knowledge is somehow “divine,” that is, as certain as God’s knowledge. In contrast, empiricists and theologians are eager to deny such a possibility. Thus, it is the Geometrical Method that provides the explanation for the two schools of early modern philosophy.

8. References and Further Reading

a. Abbreviations

A
- See Leibniz 1921ff.
AT
- See Descartes 1996.
CSM
- See Descartes 1985-88.
Loemker
- See Leibniz 1969.
Leibniz-Clarke
- See Leibniz 2000.
OL
- See Hobbes 1839.

b. Bibliography

Arnauld, Antoine/Pierre Nicole (2011), La logique ou l’art de penser, ed. crit. by Dominique Descotes, Champion: Paris.
Breger, Herbert (1991), “Der mechanizistische Denkstil in der Mathematik des 17. Jahrhunderts,” in: Hartmut Hecht (Ed.), Gottfried Wilhelm Leibniz im Philosophischen Diskurs über Geometrie und Erfahrung, Akademie Verlag, 15-46.
Breger, Herbert (2008), “Leibniz’s Calculation with Compendia,” in: Ursula Goldenbaum/Douglas Jesseph, Infinitesimal Differences. Controversies between Leibniz and His Contemporaries, De Gruyter: Berlin-New York, pp. 185-198.
Cassirer, Ernst (1974), Das Erkenntnisproblem in der Philosophie und Wissenschaft der neueren Zeit, 4 vols., Wissenschaftliche Buchgesellschaft: Darmstadt.
Corr, Charles A. (1972), “Christian Wolff’s Treatment of Scientific Discovery,” in: Journal of the History of Philosophy, 10, pp. 323-334.
Curley, Edwin (1986a), “Spinoza’s Geometric Method,” in: Studia Spinozana 2, Central Theme: Spinoza’s Epistemology, ed. by E. Curley, W. Klever, F. Mignini, Walther & Walther: Alling 1986, pp. 152-169.
Curley, Edwin (1986b), Behind the Geometrical Method, Princeton University Press.
De Dijn, Hermann (1986), “Conceptions of Philosophical Method in Spinoza: Logica and Mos Geometricus,” in: The Review of Metaphysics, XL, No. 1, Issue No. 157 (Sep), pp. 55-78.
Descartes, René (1985-88), The Philosophical Writings of Descartes, trans. by John Cottingham, Robert Stoothoff, and Dugald Murdoch, Cambridge University Press: Cambridge, New York, 3 vols.
Descartes, René (1996), Oeuvres de Descartes, ed. by Charles Adam & Paul Tannery, Librairie Philosophique J. Vrin: Paris, 11 vols.
Di Bella, Stefano (2005), The Science of the Individual: Leibniz’s Ontology of Individual Substance, Dordrecht.
Galilei, Galileo (1907), Opere. Edizione Nazionale, vol. 19, ed. by Antonio Favaro, Firenze: Barbèra.
Galilei, Galileo (1954), Dialogues Concerning Two New Sciences, trans. by Henry Crew & Alfonso de Salvio, introd. by Antonio Favaro, Dover Publications: New York.
Galilei, Galileo (1960), “The Assayer,” in: Stillman Drake and C.D. O’Malley, The Controversy on the Comets of 1618, University of Pennsylvania Press: Philadelphia.
Galilei, Galileo (1967), Dialogue Concerning the Two Chief World Systems (2nd edition), trans. by Stillman Drake, University of California Press: Berkeley.
Goldenbaum, Ursula (1991), “Daß die Phänomene mit der Vernunft Übereinstimmen sollen. Spinozas Versuch einer Vermittlung von geometrischer Theorie und experimenteller Erfahrung,” in: Leibniz im philosophischen Diskurs über Geometrie und Erfahrung. Studien zur Ausarbeitung des Erfahrungsbegriffes in der neuzeitlichen Philosophie, ed. by Hartmut Hecht, Akademie Verlag: Berlin 1991, pp. 86-104.
Goldenbaum, Ursula (2004), Appell an das Publikum. Die öffentliche Debatte in der deutschen Aufklärung 1697-1786. Sieben Fallstudien, 2 Parts, Akademie Verlag: Berlin.
Goldenbaum, Ursula (2011), “Spinoza—ein toter Hund? Nicht für Christian Wolff,” in: Zeitschrift für Ideengeschichte, ed. by Johannes Ulrich Schneider, pp. 29-41.
Hobbes, Thomas (1839), Opera Philosophica quae latine scripsit omnia, ed. by Gulielmi Molesworth, Bohn: London.
Hobbes, Thomas (1994), Leviathan, with selected variants from the Latin edition of 1668, ed. by Edwin Curley, Hackett: Indianapolis.
Hubbeling, Hubertus Gesinus (1964), Spinoza’s Methodology, Van Gorcum: Assen.
Hubbeling, Hubertus Gesinus (1977), “The development of Spinoza’s axiomatic (geometric) method: The reconstructed geometric proof of the second letter of Spinoza’s correspondence and its relation to earlier and later versions,” in: Revue internationale de philosophie 31, pp. 53-68.
Jesseph, Douglas (1999), Squaring the Circle: The War between Hobbes and Wallis, University of Chicago Press: Chicago and London.
Kant, Immanuel (1998), Critique of Pure Reason, trans. and ed. by Paul Guyer and Allen W. Wood, Cambridge University Press: Cambridge, New York.
Kant, Immanuel (2000), Critique of the Power of Judgment, ed. by Paul Guyer, trans. by Paul Guyer and Eric Matthews, Cambridge University Press: Cambridge and New York.
Klever, Wim (1986), “Axioms in Spinoza’s Science and Philosophy of Science,” in: Studia Spinozana 2, Central Theme: Spinoza’s Epistemology, ed. by E. Curley, W. Klever, F. Mignini, Walther & Walther: Alling 1986, pp. 171-195.
Leibniz, Gottfried Wilhelm (1921ff.), Sämtliche Schriften und Briefe, Akademie Verlag: Berlin (quoted as A with Roman number of series and Arabic number of volume, followed by page number).
Leibniz, Gottfried Wilhelm (1969), Philosophical Papers and Letters, trans. and ed. by Leroy E. Loemker, Reidel: Dordrecht.
Leibniz, Gottfried Wilhelm (1996), New Essays on Human Understanding, trans. and ed. by Peter Remnant and Jonathan Bennett, Cambridge University Press: Cambridge, New York.
Leibniz, Gottfried Wilhelm/Samuel Clarke (2000), Correspondence, ed. by Roger Ariew, Hackett: Indianapolis.
Locke, John (1975), An Essay Concerning Human Understanding, ed. by Peter H. Nidditch, Oxford University Press.
Matheron, Alexandre (1986), “Spinoza and Euclidean Arithmetic: The Example of the Fourth Proportional,” (trans. by David Lachterman), in: Spinoza and the Sciences, ed. by Marjorie Grene and Debra Nails, Reidel: Dordrecht, pp. 125-50.
Maxwell, Vance (1988), “The Philosophical Method of Spinoza,” in: Dialogue 27 (1988), pp. 89-110.
More, Henry (1711), Epistola H. Mori ad V.C., quae Apologiam complectitur pro Cartesio, quaeque Introductionis loco esse poterit ad universam Philosophiam Cartesianam, 4th ed., Londini.
Pascal, Blaise (2000), Œuvres complètes, ed. by Michel Le Guern, Gallimard: Paris, vol. II.
Schüling, Hermann (1969), Die Geschichte der axiomatischen Methode im 16. und beginnenden 17. Jahrhundert, Olms: Hildesheim, New York.
Spinoza (1744), B.v.S. Sittenlehre widerleget von dem berühmten Weltweisen unserer Zeit Herrn Christian Wolff, Varrentrapp: Frankfurt and Leipzig [Reprint: Hildesheim and New York: Olms 1981].
Spinoza (1985), The Collected Works of Spinoza, ed. and trans. by Edwin Curley, Princeton University Press.
Wolff, Christian (1734), A Discovery of the True Cause of the Wonderful Multiplication of Corn; With Some General Remarks upon the Nature of Trees and Plants, Printed for J. Roberts: London.
Wolff, Christian (1980), Gesammelte Werke, 1. Abt., Vol. 10, Olms: Hildesheim, New York.

Author Information

Ursula Goldenbaum
Email: ugolden@emory.edu
Emory University
U. S. A.

Phenomenological Psychology

Phenomenological psychology is the use of the phenomenological method to gain insights regarding topics related to psychology. Though researchers and thinkers throughout the history of philosophy have identified their work as contributing to phenomenological psychology, how people understand phenomenological psychology is a matter of some controversy. On the one hand, in light of contemporary philosophy’s affirmation of qualia as non-reducible, some understand phenomenological psychology to be merely a method for understanding subjective experience. When phenomenological psychology is understood this way, clarification is usually sought in terms such as “introspection” and “psychologism.” Put as a question, are the research methods identified as phenomenological and used in psychology ultimately the formalization of methods for gathering and preserving data regarding merely the subjective experience of (subjective and objective) events?

On the other hand, phenomenological psychology refers to the use of phenomenology to study the necessary and universal structures of experience. In this way, phenomenological psychology is grounded in transcendental analysis as a research method which analyzes the necessary conditions for the possibility of human experience. Whereas according to the former understanding, the results of such research supposedly have minimal to no universal generalizability, the latter understanding speaks of a cognitional structure universally generalizable to the human species. This article discusses the nature and history of phenomenological psychology, addressing the above distinct understandings of phenomenology as applied to psychology and the distinction between phenomenological and naturalistic psychology.

1. What is Phenomenology?
2. What is Psychology?
3. Which Husserl? Whose Phenomenology?
a. Husserl’s Five Different Introductions to Phenomenology
b. Husserl’s Three Different Ways to Phenomenological Reduction
4. Phenomenological Psychology as a Science
5. Phenomenological Psychology as the Analytic of Ontic Dasein
6. Conclusion
7. References and Further Reading

1. What is Phenomenology?

a. Method vs. Movement

Phenomenology may be understood as a method for investigating the cognitional structure of experience or as a movement in the history of philosophy. Given the heterodoxy of approaches and emphases in the history of philosophy to phenomenology, formal explications of phenomenology usually resist speaking as if “phenomenology” refers to a unified “school” of thought. Yet, when considered as a movement in the history of philosophy, Edmund Husserl (1859-1938) is identified as the founder of phenomenology, and when considered as a method Immanuel Kant (1724-1804) is identified as the progenitor of phenomenology.

It has become customary when discussing the origin of the term “phenomenology,” to refer to Christoph Friedrich Oetinger’s (compare Kant, 1900) 1762 use of the term and to invoke, following Martin Heidegger, a reference to Johann H. Lambert’s 1764 New Organon (Neues Organon) from where it appears Kant obtained the term. In a 1770 correspondence with Lambert, the outline of Kant’s appropriation of the term into the Critique of Pure Reason can already be seen. According to Kant,

The most universal laws of sensibility play an unjustifiably large role in metaphysics, where, after all, it is merely concepts and principles of pure reason that are at issue. It seems to me a quite particular, although merely negative science, general phenomenology (phaenomenologia generalis), must precede metaphysics. In it the principles of sensibility, their validity and their limitations, would be determined, so that these principles could not be confusedly applied to objects of pure reason (Kant, 1986, p. 59, translation slightly modified; compare Heidegger, 2005, p. 3).

Two pieces are of the utmost importance in this passage from Kant. First, Kant makes a distinction between the impure and the pure use of reason. Impure reason refers to the a priori aspects of experience, and these aspects are universal within the human experience. Further, impure reason is differentiated from pure reason insofar as impure reason includes what Kant in the above passage calls “sensibility.” Hence, “phenomenology,” for Kant, should be understood as the “science” that studies the aspects universal to human experience.

The second important piece of the Kant passage is his explicit description of phenomenology as determining the “principles of sensibility.” Here, “principle” should be understood in terms of the structural origins of human experience. In other words, Kant understands the principles of sensibility to belong to the order of necessary and universal conditions of human experience, a.k.a. the “structure of experience.” Already in this earliest definition by Kant, phenomenology pertains to human experience and, thereby, takes the first-person perspective of some subject as a point of departure. However, because phenomenology studies the universal and necessary aspects of such experience, it is neither merely subjective, nor concerned with a particular psychological subject.

G.W.F. Hegel inherited this understanding of phenomenology from Kant. According to Joseph Kockelmans, “it was only with Hegel that a well-defined technical meaning became attached” to the term phenomenology. For “Hegel, phenomenology was not knowledge of the Absolute-in-and-for-itself, in the spirit of Fichte or Schelling, but in his Phenomenology of Spirit [(Phänomenologie des Geistes)] he wanted to solely consider knowledge as it appears to consciousness” (Kockelmans, 1967, p. 24). Further, beyond the emergence of the term “phenomenology” in the eighteenth century, Heidegger traces its etymology to the terms phainomenon and logos in Aristotle, especially Book II of De Anima (On the Soul), where Aristotle discusses “seeing” (compare Heidegger, 2005, pp. 3-18).

It was not until the twentieth century, however, that a phenomenological “movement” is identified in the history of philosophy (compare Spiegelberg, 1965). Though Husserl is identified as the founder of this movement, the perplexities involved in understanding this movement as unified are discussed below. What is clear is that Husserl’s initial formulation of phenomenology was influenced by Franz Brentano (1838-1917). Not only is Brentano credited with identifying “intentionality” as the mark of the mental, at the University of Vienna “in his lectures on Descriptive Psychology (1889), Brentano employed the phrase ‘descriptive psychology or descriptive phenomenology’ to differentiate” a descriptive science of the mental “from genetic or physiological psychology” (Moran, 2000, p. 8). However, in what will be a central and career-long concern for Husserl, a descriptive phenomenology or psychology must avoid psychologism.

Though what is meant by psychologism is discussed below, it may be simply understood as the attempt to make objective reality depend upon the psychological features of some subject. For example, on the one hand, though some thing may be experienced differently by different humans, it is still the case that there is some thing to be experienced. That means it is not the case that the thing would be there for some humans and not for others. On the other hand, despite differences across human subjects (for example color blindness, mental illness, habitual tendencies) there are objective aspects of the experience of a thing which are universalizable across humans. Hence, phenomenology is not concerned with the non-universalizable.

b. Avoiding Psychologism

Though Husserl identifies more than one kind of psychologism, a general characterization of Husserl’s phenomenology, insofar as it is an attempt to avoid psychologism, is possible. Psychologism for Husserl is a kind of relativism. In the two-volume set titled Logical Investigations (1900-1901), which Husserl identified as his entry into phenomenology, psychologism is the theme of the entire first volume. There he notes, “Psychologism in all its subvarieties and individual elaborations is … relativism” (Husserl, 2001a, p. 82).

Generally stated, objective aspects of human experience are “psychologized” when “their objective sense, their sense as a species of objects having a peculiar essence, is denied in favor of the subjective mental occurrences, the data in immanent or psychological temporality” (Husserl, 1969, p. 169). According to Husserl, “the expression psychologism” applies to “any interpretation which converts objectivities into something psychological in the proper sense” (Husserl, 1969, p. 169; compare Hopkins, 2006). This is to say that, at any moment of some human subject’s experience, the content of that moment may be differentiated into the objective and subjective aspects of the experience, and one is guilty of psychologism when one treats the objective (universalizable) aspects of the experience as if they are merely subjective. Though different subjects have different perspectives, to claim that the reality of a situation is not universally true, because it depends instead on the subjective determination of subjects, is to be guilty of psychologism.

Husserl’s phenomenology, even his “descriptive” phenomenology, may be characterized as an attempt to avoid psychologism. In the second volume of Logical Investigations Husserl identifies the “exclusive concern” of phenomenology as

experiences intuitively seizable and analyzable in the pure generality of their essence, not experiences empirically perceived and treated as real facts, as experiences of human or animal experients in the phenomenal world that we posit as an empirical fact. This phenomenology must bring to pure expression, must describe in terms of their essential concepts and their governing formulae of essence, the essences which directly make themselves known in intuition, and the connections which have their roots purely in such essences. Each such statement of essence is an a priori statement in the highest sense of the word (Husserl, 2001b, p. 86).

Understanding Husserl’s phenomenology as engaged in a “war” (Husserl, 1969, p. 172) on psychologism helps clarify the actual relation between the various phenomenological psychology approaches to subjective experience and, at least, Husserl’s phenomenology, if not the “phenomenology movement” itself.

Not only is Husserl’s statement above helpful toward getting a sense of the theme of Husserl’s philosophy, it also invokes the important role of the a priori in his understanding of phenomenology. Contents of experience derived from the senses, that is the a posteriori, cannot provide universal and necessary knowledge. Similarly, “empiricism expressly teaches” “more or less vague probabilities resting on experience and induction, concerned with matters of fact in the life of man” (Husserl, 2001a, p. 56). Hence, Husserl’s concern to uncover the universal and necessary, that is the a priori, conditions of possible experience reveals a deep kinship with Kant’s critical philosophy generally, and specifically his Critique of Pure Reason (compare Kant, 1998; compare Allison, 1975; compare Heidegger, 1997).

This kinship is already indicated in the understanding of phenomenology as a method, often referred to as “transcendental analysis” or simply “phenomenology,” and Kant as the progenitor of this method. Yet, some phenomenological psychologists are still reluctant to acknowledge the value of Kant, though Husserl himself eventually affirmed the primacy of Kant’s thinking in such statements as the following: “The proof of this idealism is therefore phenomenology itself. Only someone who misunderstands either the deepest sense of intentional method, or that of transcendental reduction, or perhaps both, can attempt to separate phenomenology from transcendental idealism” (Husserl, 1999, p. 86). As an example, then, of someone who takes the method over the movement reading of phenomenology, Tom Rockmore in his Kant and Phenomenology provides a cogent characterization. According to Rockmore, Husserl “believed that he invented phenomenology and that earlier efforts, notably in Hegel, whom he seems to have known little about, but whom he criticized, were not significant” (Rockmore, 2011, p. 101). However, Rockmore goes further to explain,

Husserl depends on Kant in a number of ways: for example, his concern for philosophy as a rigorous science, his conception of phenomenology as transcendental idealism, the relation of transcendental phenomenology to the life-world, and, above all, the problem of psychologism. This problem, which arises in Kant’s criticism of Lockean so-called physiology, leads to a conception of the subject as a later version of the Kantian transcendental unity of apperception running through Husserl’s position from beginning to end (Rockmore, 2011, p. 101).

Rather than address each of these aspects in Husserl’s phenomenology that are indebted to Kant, a brief discussion of “transcendental analysis,” combined with the above discussion of “psychologism,” should provide a sufficient base with which to grasp the following discussion of phenomenological psychology.

c. Transcendental Analysis and Attitude

How, then, is “transcendental analysis” to be understood? In From Kant to Davidson, Andrew Carpenter concisely suggests, “Kant’s transcendental strategy involved investigating the necessary conditions for the possibility of experience” (Carpenter, 2003, p. 219). Carpenter then indicates three requirements. Firstly, “Identifying a phenomenon that one’s interlocutors agree exists.” Secondly, “Investigating the necessary conditions for the possibility of that phenomenon” (Carpenter, 2003, p. 219). Thirdly, “Examining the philosophical implications of the resulting ‘transcendental analysis’ of the possibility of the phenomenon [emphasis added]” (Carpenter, 2003, p. 219). This characterization correctly emphasizes transcendental analysis as a method with which to arrive not at the subjective characters of a phenomenon, but at the necessary conditions for a phenomenon. Moreover, this characterization correctly illustrates the nature of the method of phenomenology, as transcendental analysis, by indicating the intermediate position of the method’s results. In other words, phenomenological disclosure of the conditions for the possibility of phenomena allows for a subsequent deeper understanding and discussion of the conditions.

This last insight, namely that the phenomenological method provides access to the necessary, and human species universal, a priori conditions for the possibility of experience, helps to contextualize Max Scheler’s (1874-1928) characterization of the “phenomenological attitude.” According to Scheler, phenomenology “is the name of an attitude of spiritual seeing in which one can see or experience something which otherwise remains hidden” (Scheler, 1973, p. 137). Then, understanding phenomenology as either a movement or method, it may also be understood as an “attitude.” Since a “method is a goal-directed procedure of thinking about facts, for example, induction or deduction” or “a particular procedure of observation and investigation, with or without experiment and with or without instrumental support for our senses, in the form of microscopes, telescopes, etc.” Scheler argues “Phenomenology, however, has a fundamentally different attitude. That which is seen and experienced is given only in the seeing and experiencing act itself … It does not simply stand there and let itself be observed” (Scheler, 1973, pp. 137-138). Hence, “attitude” refers to the relation to a phenomenon which allows it to show itself as itself (compare Heidegger, 1962, p. 51), when to a different attitude it would have shown itself differently. That the phenomenological attitude has the character of a science is ensured by the universality and necessity of what shows itself to observers who have gained such a relation to phenomena.

As the remaining sections explicate more fully, the discussion so far may already allow for a preliminary understanding of how phenomenology may be thought of as a descriptive psychology, and how a descriptive psychology may be understood as a phenomenological psychology. Whether considered as a movement, method, or attitude, phenomenology is understood to involve observation of phenomena yielding results of a specific kind. What is at stake, then, for observational research to be identified as phenomenological psychology, will involve the kind of results the research seeks to yield. Contextualizing phenomenological psychology as such, despite the claims of researchers from diverse movements utilizing diverse methods and with various attitudes to be engaged in some type of “phenomenology,” will help clarify whether such research is truly “phenomenological” psychology.

Consider that according to Aron Gurwitsch (1901-1973), “Husserl once referred to” Dorion Cairns (1901-1973) as “the future of phenomenology” in America, and as professor of philosophy and psychology and “arguably Husserl’s closest continuer” Cairns claimed, “It is an historical fact that Husserl’s investigations of subjectivity always had a philosophical goal. Their primary goal was never psychological. The results of his investigations can nevertheless be interpreted psychologically, as he himself indicated” (Cairns, 2010, pp. 1-2). Further, “A psychological interpretation of Husserl’s results is a simplification. The most abstruse of his methodological theories, the theory of transcendental-phenomenological reduction, is disregarded when his results are interpreted psychologically” (Cairns, 2010, p. 2). Yet, Cairns wavered, this should not stop “the psychologist who wants to discover in Husserl’s writings whatever is relevant to psychology as a natural science” (Cairns, 2010, p. 2).

2. What is Psychology?

a. Natural Science vs. Human Science

It is helpful to give a brief statement regarding the meaning of “psychology,” in order to understand to what “phenomenological psychology” is supposed to refer. Of all the many distinctions by which the science of psychology may be sub-divided, the distinction between psychology as a natural and as a non-natural science retains priority. This distinction may be seen throughout the entire history of philosophy and psychology (compare Brennan, 2002). Namely, the distinction is that between psychology as a natural science and psychology as a human science (compare Van Kaam, 1966).

Generally stated, psychology as a natural science seeks to account for psychological phenomena as natural phenomena, and psychology as a human science seeks to account for psychological phenomena as human, social, and cultural phenomena. Whereas the methods of psychology as a natural science tend toward those found in biology, chemistry or physics, the methods of psychology as a human science tend toward those found in history, sociology, and anthropology. There is currently a good deal of debate regarding whether phenomenology should be considered only a method viable for psychology as a human science or as both a human and natural science. Hence, how phenomenological psychology is to be understood is a matter of some controversy.

It is, therefore, insufficient to simply suggest, along with the Oxford Encyclopedia of Psychology, that “The term phenomenological is often used by psychologists to refer simply to the subjective point of view” (Kazdin, 2000, p. 162). On one hand, phenomenological analysis proper seeks the universal and necessary conditions for the possibility of human experiential phenomena. On the other hand, there is a paradigm for research in psychology as a natural science that seeks to isolate subjective phenomena, for example qualia, for the sake of discovering a correlation with natural phenomena such as electro-chemical activity of the central nervous system. Despite a departure from phenomenology proper, phenomenological psychology still refers, though ambiguously, to meaningful research projects; however, the specific difference between phenomenological and non-phenomenological projects in psychology is not “simply” “the subjective point of view” (compare Husserl, 1977, pp. 110-115).

b. Naturalistic vs. Personalistic Standpoint

Husserl was aware of the different approaches to psychology as a science, and though subjective phenomena qua subjective, as both Husserl and Cairns above explained, are not properly “phenomenological,” there is a distinction from Husserl’s work which may help further clarify the meaning of phenomenological psychology. In Book II of Husserl’s Ideas Pertaining to a Pure Phenomenology and to a Phenomenological Philosophy, he characterizes both of these approaches to psychology as depending upon two different types of the specific, and properly, phenomenological-transcendental attitude. In other words, this is his distinction between a “naturalistic attitude” and a “personalistic attitude.” Husserl notes phenomenologists can move “quite effortlessly, from one attitude into another, from the naturalistic into the personalistic, and as to the respective sciences, from the natural sciences into the human sciences” (Husserl, 2000, p. 190). Moreover, the personalistic attitude is “the attitude we are always in when we live with one another, talk to one another, shake hands with one another…” (Husserl, 2000, p. 192).

At this point, a number of different ways to identify generally the relation between psychology and phenomenology are available. Firstly, some part or portion of psychology may be seen as the study of merely subjective phenomena, and such a psychology would, thereby, incorrectly be called “phenomenological” in the proper philosophical sense. Moreover, even if subjective concerns in psychology are not the results of introspection, they pertain exclusively to empirical phenomena and would not be properly “phenomenological.” Secondly, the topics and themes of psychology may be seen as resulting from an attitude between a natural attitude and the properly phenomenological-transcendental attitude. In this way, the study of such topics and themes should lead ultimately to consideration of the transcendental features involved. Thirdly, psychology as a whole may be divided into the different attitudes of the naturalistic and personalistic with research in psychology as a natural science and as a human science resulting from these, respectively, and with both attitudes subordinated to the properly phenomenological attitude (compare Husserl, 1977, p. 166). Notice, in this way all phenomena, as phenomena of human experience, fall within the scope of phenomenology proper; however, it points to a significant confusion on the part of the psychologist when the non-universal, non-necessary aspects of the phenomena are taken as the features to be studied through phenomenological science. Hence, it is as if these three general identifications relate to one another circularly, since failure to accomplish the transcendental-phenomenological viewpoint of the third may place the psychologist, studying merely subjective phenomena, back at the first.

c. Elimination vs. Reduction vs. Supervenience

From the properly phenomenological perspective of the third general identification, then, the following comments by Kant and Husserl are understood more easily. Kant famously argued in the Preface to his Metaphysical Foundations of Natural Science that empirical psychology can never be a proper natural science (Kant, 2004, p. 7). For Kant, the naturalization of psychology suggests a denial of free will in humans, a position his philosophy fundamentally rejects. Similarly Husserl complained, “What is needed is a new ‘psychology’ of an essentially different type, a universal science of the spirit that is neither ‘psychophysical’ nor natural-scientific” (Husserl, 2000, p. 181; compare Husserl, 1970).

Yet, as indicated with the primary division of psychology into natural and human science, psychology tends to take a psychophysical understanding of human being as a point of departure for further research (compare [../hard-con/]). In fact, psychologists may be classified by a taxonomy of relations between the psychological and the physical. There are those who seek an elimination of either the psychological or the physical in favor of the other, and there are a number of ways to take up such a position. However, the most popular of such ways today is, perhaps, “eliminative materialism” (compare Churchland, 1981). Next, there are those who seek a reduction of either the psychological or the physical to the other, though, again, it seems more popular and plausible today to find the reduction of the psychological to the physical advocated. Lastly, there are those who seek to characterize the relation in terms of supervenience. Perhaps the most popular articulation suggests that psychological states cannot be eliminated in favor of, or reduced to, physical states; however, there can be no changes to psychological states without there being accompanying changes to physical states (compare Kim, 1984; compare Kim, 1987).

In a position exemplified by his books Mind in a Physical World (1998) and Physicalism, Or Something Near Enough (2005), Jaegwon Kim privileges the physical over the psychological, while characterizing the relation between the two as one of “conditional functional reduction.” Now, to say that some mental property is “functionalizable” is to say that its presence as a property of consciousness can be associated with the function it serves regarding the physical environment. Hence, though Kim affirms the irreducibility of the qualitative phenomenal properties (qualia) of consciousness to physical properties, there is a conditional reduction of qualia to the functional role they play regarding the organism’s adaptation to the environment. Insofar as these positions regarding the psychophysical constitution of human beings indicate the context of the elements involved in research identified as within phenomenological psychology, and with the avowed goal of “naturalizing” the phenomenology of qualia (compare Varela, 1992), how might Husserl see such research projects?

To sketch a brief response to this question, beyond the gestures already made above (for example, the third general identification of phenomenological psychology), consider the following comments from a section titled “The delimitation of somatology and psychology” in Book III of Ideas. According to Husserl,

What one has here, from the point of view of natural science, is a number of individual human beings each with a particular consciousness, a particular psyche … belonging to each. In the psycho-physical interrelated context that is made possible by the material interrelations of the animate organisms, there arise in the individual psyches acts that are intentionally directed at something psychically external. But what appears here is always only new states of the individual psyches (Husserl, 1980, p. 18).

Later in the same book, Husserl clarifies,

As we know, there come continually into consideration in the phenomenological exploration of the acts both consciousness itself and the correlate of consciousness, noesis and noema. To describe and determine according to essence the phenomenon of intuition of a physical thing … is at the same time also to keep in mind that the act in itself is the “meaning” of something and that what is meant as such is “physical thing.” But to substantiate this, indeed, to make what is meant as physical thing as such, namely as correlate (something perceived as such with regard to the perception, something named as such with regard to the naming), the object of research … that is not to explore physical things, physical things as such. A “physical thing” as correlate is not a physical thing; therefore the quotation marks (Husserl, 1980, p. 72).

Simply put, “one must not confuse noema (correlate) and essence” (Husserl, 1980, p. 73). Wherever we go, we bring the necessary and universal conditions for the possibility of experience to our experiences. Both the naturalization project and the merely subjective point of view project are misidentified with phenomenological psychology, considering phenomenology proper; moreover, both of these projects may fail at avoiding psychologism (compare Husserl, 2001b, p. 86, quoted above; compare Husserl, 1977, p. 38).

3. Which Husserl? Whose Phenomenology?

a. Husserl’s Five Different Introductions to Phenomenology

As David Carr discusses in his “Translator’s Introduction” to Husserl’s The Crisis of the European Sciences and Transcendental Phenomenology: An Introduction to Phenomenological Philosophy, Husserl produced a number of different “introductions” to phenomenology. However, as the above discussion of the progressive movement to transcendental phenomenology shows, there is a continuity to be discerned across the introductions (compare McKenna, 1982). Yet, at the same time, Husserl’s continued attempt to “introduce” phenomenology is widely seen as contributing to the controversy regarding the meaning of the term “phenomenology” itself (compare Spiegelberg, 1965). As Rockmore put it, “Husserl’s unconvincing claim to have invented phenomenology, which he struggles to define in a long series of texts, leaves both the meaning of the term, the genesis of the approach, and its import unresolved” (Rockmore, 2011, p. 191). According to Carr, Husserl attempts an introduction to phenomenology in all of the following books: Logical Investigations (1900); Ideas (1913); Formal and Transcendental Logic (1929); Cartesian Meditations (1931); The Crisis of the European Sciences (1937).

Further, as William McKenna mentions in his Husserl’s “Introductions to Phenomenology” and Iso Kern explicates in his article, “The Three Ways to the Transcendental Phenomenological Reduction,” these five books point to three ways to the much-discussed phenomenological reduction. Iso Kern, following and clarifying Hans-Georg Gadamer, indicates a “Cartesian way,” a way through “intentional psychology,” and a way through “ontology” into the “transcendental phenomenological reduction” (Kern, 1977, p. 126). Kern suggests these three ways are “not always sharply and clearly separated” in Husserl’s work. These ways may be seen as responses to questions such as “Through which steps in thinking does philosophic cognition arise?” and “How does knowing emerge from the aphilosophical life and become genuinely philosophical?” (Kern, 1977, p. 126).

b. Husserl’s Three Different Ways to Phenomenological Reduction

Since each of the ways explained by Kern is a way into the transcendental phenomenological attitude, only their differences will be briefly characterized here. The characterization of their differences is helpful toward clarifying what is meant by phenomenological psychology. This is because across the differing introductions, it is not difficult to lose sight of the many different unifying themes with which to coherently understand the relation between phenomenology and psychology. The key is to see that the introductions, rather than being set against one another, should be unified around Husserl’s attempts to instruct readers into the transcendental phenomenological attitude.

The Cartesian way seeks an absolute starting point from which philosophy may be understood as a science. This starting point demands absolute evidence, and this means simply clear and distinct evidence that cannot be doubted. Belief in the mind-external world is then to be doubted, since there is supposed to be no absolute evidence for belief in the mind-external world. Yet, knowledge about the world is based on belief in the world’s existence, and experience of what was previously believed to be the world does not cease when belief in the world is doubted. Hence, this relation to the experience of the “world,” is a reduced relation. The final step in the Cartesian way is to understand the intentional relation to the “world” as that of the “cogito,” that is the intentionality of the acting ego, such that the cogito provides absolute evidence for itself as the starting point for philosophy understood as a science. Notice, phenomenology involves understanding how the intentional structure of the subject provides objective knowledge of the mind-external world, and as such phenomenology’s interest in the intentional structure of the subject is not “subjective.”

The way through “intentional psychology,” then, according to Kern, takes the “physical sciences, which are interested purely in the physical and abstracts from everything psychic. In opposition to these sciences, Husserl conceives the idea of a complementary science which is interested purely in the psychic and abstracts from everything physical” (compare Kern, 1977, p. 134). By pointing out that relations between objects in the lived experience of humans are not relations between those objects in mind-external reality, Husserl points the way to “lived experience.” This may be compared to the focus on the intentional relation to the “world” in the Cartesian way. Moreover, the lived experience pertains to the subject, but it is not “subjective.” Kern provides the following two quotes from Husserl’s The Crisis of the European Sciences as convincing evidence of Husserl’s view: “Psychology, the universal science of the purely psychic in general – therein consists its abstraction” (Husserl, 1978, p. 252) and “in the pure development of the idea of a descriptive psychology, which seeks to bring to expression what is essentially proper to souls, there necessarily occurs a transformation of the phenomenological-psychological epoché and reduction into the transcendental” (Husserl, 1978, p. 257).

Lastly, the “ontological” way may be seen as a direct attack on the psychologist who might mistakenly think phenomenology to refer “simply to the subjective point of view” (compare Kazdin, 2000, p. 162). According to Kern, “Rather, the objective ‘theme’ is implied intentionally in the subjective ‘theme’ (in the intentional life of subjectivity)” (compare Kern, 1977, p. 137). Further, “The change of attitude is to be compared with the transition from the second to the third dimension of space, which contains in itself the second dimension. This subjectivity [emphasis added], in which everything objective is constituted, is the transcendental one” (Kern, 1977, p. 137). Hence, the psychologist who takes phenomenological psychology to be an investigation of “the subjective point of view” understood as a “perspective through which the individual experiences his or her world [emphasis added]” (compare Kazdin, 2000, p. 164) is not actually engaged in phenomenological psychology. Further, the popular tendency to emphasize a subject’s “perspective” as transcending both other subjects and the potential truth value of criticism from other subjects stems from a misunderstanding of phenomenological psychology. As Kern explains, “This subjectivity … is exhibited as an intersubjectivity, made communal through the common objectivity,” and this science is an “exploration of the universal transcendental life, in which worldly objectivity [emphasis added], with its ontological a priori, is constituted” (Kern, 1977, p. 137).

Though an exhaustive list of phenomenologists is outside the scope of this article, what follows is a brief list of major figures in phenomenology. The purpose of this list is to suggest that, despite the heterogeneity of approaches across the figures peopling the list, as far as these individuals were engaged in phenomenology, they participated in a method grounded in the transcendental attitude. These figures include: Edmund Husserl; Martin Heidegger; Jean-Paul Sartre; Maurice Merleau-Ponty; Max Scheler; Edith Stein; Adolf Reinach; Moritz Geiger; Roman Ingarden; Dietrich von Hildebrand; Aron Gurwitsch; and Gabriel Marcel, among many others.

4. Phenomenological Psychology as a Science

a. Phenomenology vs. Phenomenography

As should be clear, phenomenological psychology, as a science, concerns itself with what is necessary and universal in human experience. This is opposed to the approach to human experience that seeks to record subjective experience as subjective. Such an approach, rather than be called “phenomenological,” is better referred to as “phenomenographical” (compare Marton, 1986). Whereas “phenomenology” refers to the study of what is objective in subjective experience, including the structures of subjectivity itself, “phenomenography” refers to the study of what is subjective in subjective experience.

With this distinction in mind, there are a number of research methods classified as within phenomenological psychology to consider. In Phenomenological Psychology: Theory, Research, and Method, Darren Langdridge explains, “when applying phenomenological philosophy to psychology, we aim to focus on people’s perceptions of the world in which they live and what this means to them: a focus on people’s lived experience” (Langdridge, 2007, p. 4). Langdridge links “developments” of phenomenology in philosophy with their corresponding research methods in psychology. For example, he claims “phenomenology” refers to a “descriptive approach,” “existentialism” refers to an “interpretive approach,” and “hermeneutics” refers to a “narrative approach” (Langdridge, 2007, p. 5). Though not listed by Langdridge, perhaps the most promising of the approaches to phenomenological psychology may be seen in Aron Gurwitsch’s work in the phenomenology of Gestalt psychology (compare Gurwitsch, 1966).

b. Descriptive Phenomenology

Descriptive phenomenology, as seen for example in Amedeo Giorgi’s The Descriptive Phenomenological Method in Psychology, results from not a “transcendental” attitude but one “more appropriate for psychological analyses of human beings since the purpose of psychology as a human science is precisely the clarification of the meanings of phenomena experienced by human persons” (Giorgi, 2009, p. 98). Associating phenomenological psychology with psychology as a human science, Giorgi suggests that in “psychology as a human science … The priority of an already existing methodology is not posited. Rather, what is posited as the privileged position is fidelity to the phenomenon” (Giorgi, 1971, p. 52). Hence, in the “descriptive phenomenological method in psychology” Giorgi explains, “The situations to be described are selected by the participants themselves and what is sought is simply a description that is as faithful as possible” (Giorgi, 2009, p. 96; compare Gilbert and Fisher, 2006; compare MacLeod, 2002; compare Loftus, 1979). Further, Giorgi acknowledges “The fact that the descriptions come from others could be challenged from a phenomenological perspective … but the descriptions provided by the experiencers are an opening into the world of the other [emphasis added] that is shareable” (Giorgi, 2009, p. 96).

c. Interpretive-Hermeneutic Phenomenology

Without discussing the other “developments” of phenomenological psychology here, the following two examples should suffice to contextualize how these developments relate to the descriptive approach. On the one hand, regarding an “Interpretative Phenomenological Analysis,” it is claimed, “One is trying to get close to the participant’s personal world” (Smith and Osborn, 2003, p. 51). On the other hand, it is suggested that the “research results” of such interpretive activities open “upon a limitless field of possible interpretations” (compare Kazdin, 2000, p. 164). Though it is not immediately clear how the results of any research could be subject to “limitless” interpretations, supposing such a characterization were true, it is also not clear what the purpose of research in psychology that is open to “limitless” interpretation might be. Hence, the controversy and challenges remain for phenomenological psychology. That is to say, the psychological sciences that self-identify as phenomenological may be interrogated regarding whether they avoid psychologism and whether they might be better classified as phenomenographic.

5. Phenomenological Psychology as the Analytic of Ontic Dasein

a. Heidegger and Science

As exemplified by work found in the Zollikon Seminars, Martin Heidegger has provided a number of valuable insights into how phenomenology may relate to psychology. This is despite the commonly held misconceptions regarding Heidegger’s relation to science. For a clear and concise discussion regarding Heidegger’s relation to science, see Joseph Kockelmans’s chapter titled “Heidegger on the Essential Difference and Necessary Relationship Between Philosophy and Science” (Kockelmans, 1970, pp. 147-167). According to Kockelmans, Heidegger does indeed see an “unbridgeable gap between philosophy and science.” Yet, “Although scientists generally interpret this view of Heidegger’s as a disparaging one, this is in no way his intention” (Kockelmans, 1970, p. 148).

In the November 23rd, 1965 seminar of the Zollikon Seminars Heidegger explicitly states his position regarding “science.” Heidegger declares, “I have reservations about science – not science as science – but only about the absolute claims of natural science” (Heidegger, 2001, p. 123; compare Heidegger, 1972, p. 77; compare Caputo, 1973; compare Krell, 2008, p. 12). From this discussion, Heidegger provides his understanding of the distinction between psychology and philosophy, and this distinction applies to phenomenology in essentially the same way it was reflected on above in Husserl. That is to say, Heidegger suggests phenomenological psychology is intermediate to phenomenological transcendental philosophy. In Heidegger’s vocabulary, this means that phenomenological psychology is “ontic” and phenomenological transcendental philosophy is “ontological.”

b. Heidegger and Psychology

What this means for Heidegger is that when phenomenology is used as a method to understand being, then phenomenology is used philosophically, and when phenomenology is used as a method to understand being as human being, then it is used psychologically or anthropologically. Put another way, “ontic” refers to the facts related to human being-in-the-world, and “ontological” refers to the conditions for the possibility of being-in-the-world. Since being is a condition for the possibility of being-in-the-world, an analysis of being will yield ontological insights. Heidegger clarifies, despite the similarity of the language, “Daseinanalysis is ontic. The analytic of Dasein is ontological” (Heidegger, 2001, p. 124). Further, “in Being and Time there was often talk about ‘Daseinanalysis.’ In this context, Daseinanalysis does not mean anything more than the actual exhibition of the determination of Da-sein as thematized in the analytic of Da-sein” (Heidegger, 2001, p. 125). Similar to the discussion of the possibility of phenomenological psychology regarding Husserl above, this is an important distinction for phenomenological psychology in Heidegger, since “Insofar as the latter is defined as existence, these determinations of Da-sein are called existentialia (compare Keen, 1975). Therefore, the concept of ‘Daseinanalysis’ [in contrast to psychological ‘Dasein-analysis’] still belongs to the analytic of Dasein and, therefore, to ontology” (Heidegger, 2001, p. 125).

To be clear, beings may be described in terms of cultural and historical facts. However, such descriptions fall short of understanding being as the condition for the possibility of beings. Frederick Wilhelmsen (1923-1996) famously described this difference in terms of beings as nouns and be-ing as a participle. Heidegger’s point here, then: it is not so much the case that ontic concerns are psychologistic (though they may be) as it is the case that they fall short of authentic ontological insights. What this means for phenomenological psychology is that insofar as it merely views the ontic fact domain of (human) being, then, according to Heidegger (like Kant and Husserl before him), it falls short of the transcendental attitude. However, just as descriptive psychology was seen above as intermediate on the way to the transcendental attitude, it is possible to interrogate the facts of human being through transcendental analysis, and such an interrogation leads to the “conditions for the possibility” of such facts. The analysis of these conditions, then, is the “analytic of Da-sein.” Hence, phenomenological psychology is not an exclusive enterprise insofar as the phenomenologically trained psychologist can, through such an analytic, rise to transcendental phenomenology and study ontology; though in doing so, they are no longer studying psychology. That is to say, on the one hand, psychology is clearly delimited from ontology. On the other hand, psychology is grounded in ontology. There can be no human being, if there is no be-ing. So, what is the value of phenomenology for psychology?

c. The Therapeutic Value of Minding the Clearing

The term “existential” should invoke the notion of freedom. As disclosing the existentials (existentialia), then, phenomenology may be used as a method toward an awareness, which is psychologically therapeutic, in its affirmation of human freedom. Just as existentialism and freedom belong together, so too awareness of the conditions making human experiences possible, when considered from the first-person perspective regarding lived experience, may be therapeutic. In essence this is the training of a client seeking psychotherapy to perform a phenomenological reduction to accomplish a transcendental attitude to their own lived experience. This is Da-sein analysis. This may be accomplished through analysis of the existentials conditioning the person-seeking-therapy’s being. Ultimately this is ontology, through psychology, not psychology; however, it is still related to psychology as being psychotherapeutic. By bringing each (human) being to an awareness of the clearing of being in which their being human is accomplished, they may “take hold of” their being differently (compare Heidegger, 1962), and this is an affirmation of the person’s freedom, which may be therapeutic given the everyday possibilities through which humans may forget the be-ing which allows beings to be.

6. Conclusion

The above discussion of phenomenology from the perspectives of a movement, a method and an attitude, clarified by examining shifts found in Husserl’s work, provided support to the value of understanding phenomenology as related to transcendental philosophy. Further, such an understanding of phenomenology elucidates the consistent thread running through the heterogeneous styles of the major figures standardly considered phenomenologists. In order to clarify further the meaning of phenomenological psychology as a science, phenomenology was contrasted with phenomenography. Phenomenography refers to the study of the merely subjective aspects of experience. Toward clarifying possible confusion regarding the potential use of phenomenology for psychology, the claim was made that much of what is called “phenomenology” today is actually phenomenography. This is an important insight involving an important distinction, and perhaps with further dissemination the controversy surrounding phenomenology will be resolved.

Lastly, Heidegger’s style of phenomenology and its relation to psychology were discussed. This included clarification, through Heidegger’s own words, of his position regarding science. Heidegger’s Da-sein analysis continues to have influence around the globe as a viable psychotherapeutic method. Interestingly, Heidegger’s Da-sein analysis, though expressed near the end of his career, has deep ties with and resonates with his Being and Time. Yet, this also extends Heidegger’s value and influence beyond even academic philosophy and psychology, since Heidegger’s philosophy, as a kind of therapy, does not necessarily require a therapist. That is to say, Heidegger’s teachings regard the first-person perspective in such a way that it becomes possible for readers, in understanding his vocabulary, to begin to “see” being as he described it. The therapeutic value involved, then, points further to the efficacious presence of philosophy in psychology and phenomenological psychology.

7. References and Further Reading

Akerlind, Gerlese S. (2005). “Variation and commonality in phenomenographic research methods.” Higher Education Research & Development 24.4: 321-334.
Allison, Henry. (1975). The Critique of Pure Reason as Transcendental Phenomenology. In D. Ihde and R.M. Zaner, (Eds.) Dialogues in Phenomenology. (pp. 137-155). The Hague: Martinus Nijhoff.
Brennan, James F. (2002). History and Systems of Psychology. New York: Pearson Prentice Hall.
Cairns, Dorion. (2010). “Nine Fragments on Psychological Phenomenology,” Journal of Phenomenological Psychology 41: 1-27.
Caputo, John D. (1973). “Language, Logic, and Time.” Research in Phenomenology 3: 147-155.
Carpenter, Andrew. (2003). Davidson’s Transcendental Argumentation. In J. Malpas, (Ed.). From Kant to Davidson: Philosophy and the Idea of the Transcendental. (pp. 219-237). London: Routledge.
Churchland, Paul. (1981). Eliminative Materialism and the Propositional Attitudes. The Journal of Philosophy, 78.2: 67-90.
Deleuze, Gilles. (1984). Kant’s Critical Philosophy, translated by H. Tomlinson and B. Habberjam. Minneapolis: University of Minnesota Press.
Dufrenne, Mikel. (1966). The Notion of the A Priori. Evanston, IL: Northwestern University Press.
Dreyfus, Hubert. (2000). Responses. In M. Wrathall and J. Malpas, Ed. Heidegger, Coping, and Cognitive Science. (pp. 313-351). Cambridge, MA: MIT Press.
Ensign, Grayson H. and Howe, Edward. (1989). Counseling and Demonization: The Missing Link. Amarillo, Tx: Recovery Publications.
Farber, Marvin. (1968). The Foundation of Phenomenology. Albany: SUNY Press.
Feigl, Herbert. (1959). “Philosophical Embarrassments of Psychology,” American Psychologist, 14: 115-128.
Frankfurt, Harry G. (1978). “The Problem of Action.” American Philosophical Quarterly 15: 157-162.
Gilbert, Julian A. E. and Fisher, Ronald P. (2006). “The effects of varied retrieval cues on reminiscence in eyewitness memory,” Applied Cognitive Psychology 20.6: 723-729.
Giorgi, Amedeo. (2009). The Descriptive Phenomenological Method in Psychology: A Modified Husserlian Approach. Pittsburgh: Duquesne University Press.
Giorgi, Amedeo. (1971). “The Experience of the Subject as a Source of Data in Psychological Experiment,” in Phenomenological Psychology, Vol. I. A. Giorgi, W.F. Fischer, R. von Eckartsberg, (Eds). (pp.50-57). Pittsburgh: Duquesne University Press.
Gurwitsch, Aron. (1964). The Field of Consciousness. Pittsburgh: Duquesne University Press.
Gurwitsch, Aron. (1974). Phenomenology and the Theory of Science. Evanston, IL: Northwestern University Press.
Gurwitsch, Aron. (1966). Studies in Phenomenology and Psychology. Evanston, IL: Northwestern University Press.
Hegel, G.W.F. Phenomenology of Spirit, translated by A.V. Miller. Oxford: Oxford University Press, 1977.
Heidegger, Martin. The Basic Problems of Phenomenology, translated by Albert Hofstadter, Bloomington: Indiana University Press, 1988.
Heidegger, Martin. Being and Time, translated by J. Macquarrie and E. Robinson. Oxford: Basil Blackwell, 1962.
Heidegger, Martin. History of the Concept of Time, translated by T. Kisiel, Bloomington: Indiana University Press, 1985.
Heidegger, Martin. Introduction to Phenomenological Research, translated by D.O. Dahlstrom. Bloomington: Indiana University Press, 2005.
Heidegger, Martin. Kant and the Problem of Metaphysics, translated by R. Taft, Bloomington: Indiana University Press, 1997.
Heidegger, Martin. Mindfulness, translated by P. Emad and T. Kalary. New York: Bloomsbury Academic, 2006.
Heidegger, Martin. On Time and Being, translated by J. Stambaugh. New York: Harper & Row, 1972.
Heidegger, Martin. What is a Thing? translated by W.B. Barton and V. Deutsch. Chicago: Henry Regnery, 1967.
Heidegger, Martin. Zollikon Seminars, translated by F. Mayr and R. Askay, Evanston, IL: Northwestern University, 2001.
Hopkins, Burt C. (2006). “Husserl’s Psychologism, and Critique of Psychologism, Revisited,” Husserl Studies 22: 91-119.
Husserl, Edmund. (1999). Cartesian Meditations, translated by D. Cairns. Dordrecht: Kluwer.
Husserl, Edmund. (1970). The Crisis of European Sciences and Transcendental Phenomenology: An Introduction to Phenomenological Philosophy, translated by D. Carr. Evanston: Northwestern University.
Husserl, Edmund. (1983). Ideas Pertaining to a Pure Phenomenology and to a Phenomenological Philosophy, Book I, T.E. Klein and W.E. Pohl, (Trans.). The Hague: Martinus Nijhoff.
Husserl, Edmund. (2000). Ideas Pertaining to a Pure Phenomenology and to a Phenomenological Philosophy, Book II, T.E. Klein and W.E. Pohl, (Trans.). The Hague: Martinus Nijhoff.
Husserl, Edmund. (1980). Ideas Pertaining to a Pure Phenomenology and to a Phenomenological Philosophy, Book III, T.E. Klein and W.E. Pohl, (Trans.). The Hague: Martinus Nijhoff.
Husserl, Edmund. (1969). Formal and Transcendental Logic, translated by D. Cairns. The Hague: Martinus Nijhoff.
Husserl, Edmund. (2001a). Logical Investigations, Vol. I., translated by D. Moran. New York: Routledge.
Husserl, Edmund. (1977). Phenomenological Psychology: Lectures, Summer Semester, 1925, translated by J. Scanlon. The Hague: Martinus Nijhoff.
Husserl, Edmund. (2001b). Shorter Investigations, translated by Dermot Moran. New York: Routledge.
Kant, Immanuel. (1998). Critique of Pure Reason, translated by P. Guyer and A.W. Wood. Cambridge: Cambridge University Press.
Kant, Immanuel. (1900). Dreams of a Spirit-Seer, translated by Emanuel Goerwitz. London: Swan Sonnenschein & Co.
Kant, Immanuel. (2004). Metaphysical Foundations of Natural Science, translated by M. Friedman. Cambridge: Cambridge University Press.
Kant, Immanuel. (1986). Philosophical Correspondence, 1759-1799, translated by A. Zweig. Chicago: University of Chicago Press.
Kazdin, Alan E. (2000). Encyclopedia of Psychology, Vol. VI. Oxford: Oxford University Press.
Keen, Ernest. (1975). A Primer in Phenomenological Psychology. Dallas, TX: Holt, Rinehart and Winston, Inc.
Kern, Iso. (1977). “The Three Ways to the Transcendental Phenomenological Reduction,” in F. Elliston and P. McCormick, (Eds.). Husserl: Expositions and Appraisals. (pp. 126-149). South Bend, IN: University of Notre Dame Press.
Kim, Jaegwon. (1984). “Concepts of Supervenience,” Philosophy and Phenomenological Research 45: 153-176.
Kim, Jaegwon. (1998). Mind in a Physical World. Cambridge: Cambridge University Press.
Kim, Jaegwon. (2005). Physicalism, Or Something Near Enough. Princeton, NJ: Princeton University Press.
Kim, Jaegwon. (1987). “‘Strong’ and ‘Global’ Supervenience Revisited,” Philosophy and Phenomenological Research 48: 315-326.
Kitcher, Patricia. (1993). Kant’s Transcendental Psychology. Oxford: Oxford University Press.
Klein, D.B. (1970). A History of Scientific Psychology. New York: Basic Books.
Kockelmans, Joseph J. (1970). “Heidegger on the Essential Difference and Necessary Relationship Between Philosophy and Science,” in Kockelmans, Joseph J. and Kisiel, Theodore J. (Eds). Phenomenology and the Natural Sciences: Essays and Translations. (pp. 147-166). Evanston, IL: Northwestern University Press.
Kockelmans, Joseph J. (1967). Phenomenology: The Philosophy of Edmund Husserl and Its Interpretation. New York: Doubleday.
Krell, David F. (2008). “General Introduction: The Question of Being,” in Basic Writings: Martin Heidegger. (pp. 1-36). New York: Harper Perennial.
Langdridge, Darren. (2007). Phenomenological Psychology: Theory, Research and Method. New York: Pearson Prentice Hall.
Loftus, Elizabeth. (1980). “Impact of expert psychological testimony on the unreliability of eyewitness identification.” Journal of Applied Psychology 65.1: 9-15.
MacLeod, Malcolm. (2002). “Retrieval-induced forgetting in eyewitness memory: forgetting as a consequence of remembering,” Applied Cognitive Psychology 16.2: 135-149.
Marton, Ference. (1986). “Phenomenography – A research approach investigating different understandings of reality.” Journal of Thought 21.2: 28-49.
McCall, Raymond J. (1983). Phenomenological Psychology. London: University of Wisconsin Press.
McKenna, William. (1982). Husserl’s “Introductions” to Phenomenology. Dordrecht: Kluwer.
Merleau-Ponty, Maurice. (2002). Phenomenology of Perception, translated by C. Smith. London: Routledge Classics.
Moran, Dermot. (2000). Introduction to Phenomenology. New York: Routledge.
Nagel, Thomas. (1974). “What Is It Like To Be a Bat?” Philosophical Review, 83: 435-450.
Place, U.T. (1956). “Is Consciousness a Brain Process?” British Journal of Psychology 47: 44-50.
Rockmore, Tom. Kant and Phenomenology. Chicago: University of Chicago Press, 2011.
Rust, John. (1987). “Is Psychology a Cognitive Science?,” Journal of Applied Philosophy 4.1: 49-55.
Sartre, Jean-Paul. Being and Nothingness: An Essay in Phenomenological Ontology, translated by Hazel Barnes. New York: Citadel Press, 2001.
Sheehan, Thomas. (2001). “A Paradigm Shift in Heidegger Research,” Continental Philosophy Review, 34.2: 1-20.
Scheler, Max. (2008). The Constitution of the Human Being, translated by J. Cutting. Milwaukee: Marquette University.
Scheler, Max. (1973). “Phenomenology and the Theory of Cognition,” in Selected Philosophical Essays, translated by D.R. Lachterman. (pp. 136-201). Evanston, IL: Northwestern University, 2001.
Spiegelberg, Herbert. (1972). Phenomenology in Psychology and Psychiatry. Evanston, IL: Northwestern University Press.
Spiegelberg, Herbert. (1965). The Phenomenological Movement, Vol. I. The Hague: Martinus Nijhoff.
Stein, Edith. (2009). Potency and Act, translated by W. Redmond. Washington, D.C.: ICS Publications.
Stein, Edith. (2000). Philosophy of Psychology and the Humanities, translated by M.C. Baseheart and M. Sawicki. Washington, D.C.: ICS Publications.
Svensson, Lennart. (1997). “Theoretical foundations of phenomenography.” Higher Education Research & Development, 16.2: 159-171.
Tassone, Biagio G. (2012). From Psychology to Phenomenology. New York: Palgrave Macmillan.
Van Kaam, Adrian. (1966). Existential Foundations of Psychology. Pittsburgh: Duquesne University Press.
Van Kaam, Adrian. (1958). “Assumptions in Psychology,” Journal of Individual Psychology, 14 (1), 22-28.
Varela, Francisco J., Thompson, Evan T., and Rosch, Eleanor. (1992). The Embodied Mind: Cognitive Science and Human Experience. Cambridge, MA: MIT Press.
Wurzer, Wilhelm. (1983). “Nietzsche’s Hermeneutic of Redlichkeit.” Journal of the British Society for Phenomenology, 14 (3), 258-269.
Zahavi, Dan. (2002). Merleau-Ponty on Husserl: A Reappraisal. In T. Toadvine and L. Embree, (Eds.). Merleau-Ponty’s Reading of Husserl. (pp. 3-31). New York: Springer.

Author Information

Frank Scalambrino
Email: FrankLScalambrino@gmail.com
Duquesne University
U. S. A.

Compositionality in Language

Compositionality is a concept in the philosophy of language. A symbolic system is compositional if the meaning of every complex expression E in that system depends on, and depends only on, (i) E’s syntactic structure and (ii) the meanings of E’s simple parts.

If a language is compositional, then the meaning of a sentence S in that language cannot depend directly on the context that sentence is used in or the intentions of the speaker who uses it. So, for example, in compositional languages, the meanings of sentences don’t directly depend on

Things said earlier in the conversation
The beliefs or intentions of the person uttering S
Salient objects and events in the environment at the time that S is uttered
The non-semantic character of S’s simple parts, such as their shape or sound

In compositional languages, the meaning of a sentence S directly depends only on the meanings of the words composing S, and the way those words are syntactically related to one another.

Of course, simple expressions in a compositional language might have meanings that depend on the context or on the intentions of their users, as the referent of the English pronoun ‘she’ can depend on who the speaker intends to be referring to. As such, sentences containing expressions such as ‘she’ will indirectly depend on the intentions of their speakers, because the meaning of the sentence depends on the meanings of its simple parts and the meanings of some of those parts depend on the speaker’s intentions.

Several arguments purport to show that not only is natural language compositional, but that it must be, since we could not have the linguistic abilities we in fact do have, unless the languages we speak are compositional. A commitment to compositionality has driven a large amount of research in the philosophy of language and in linguistics, since it appears to be very difficult to provide adequate compositional treatments of commonplace linguistic constructions. On the other hand, some philosophers have argued that natural language is not compositional, or that compositionality induces no substantive restriction on possible theories of meaning.

This article addresses the different ways compositionality has been understood by philosophers and linguists, and surveys the arguments that natural language is, must be, or should be compositional, as well as the arguments that it isn’t or needn’t be.

Interpretations of Compositionality
Arguments for Compositionality
The Dialectical Role of Compositionality in Philosophy
1. Real Meanings
2. Semantic Theories Purportedly at Odds with Compositionality
Challenges to Compositionality
Conclusion
References and Further Reading

1. Interpretations of Compositionality

a. Syntactic Structure

In natural languages (such as English, Cantonese, and Kalaallisut), the smallest meaningful symbols are called “morphemes.” For highly analytic languages such as English, there is a large overlap between morphemes and words: words are largely the smallest meaningful units. English does have a number of morphemes that are not words, however, such as the plural ending –s for nouns, the possessive ending –’s for noun phrases, and the 3rd person singular ending –s for verbs. These are “bound” morphemes, in that they cannot grammatically occur on their own. In other, more synthetic languages such as Kalaallisut, single words can be made of many meaningful parts. The word atuartariaqalirpuq (“he began to have to study”) contains six morphemes, and can be used by itself as a sentence (example from Bittner 1995).

Morphology is the set of rules governing how morphemes are combined to form words; syntax is the set of rules governing how words are combined to form phrases and, ultimately, sentences. These rules describe (among other things) how smaller parts, the constituents, are put together to form larger units. The syntactic rules that formed an expression can affect its meaning. Consider the expression ‘large horse painting’: it can mean either painting of a large horse or large painting of a horse, depending on whether ‘large’ is modifying just ‘horse’ or the whole of ‘horse painting.’
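The two readings of ‘large horse painting’ can be pictured as two different groupings of the same three words. The following is a minimal sketch in Python, assuming (purely for illustration) that a syntactic structure can be written as a nested tuple; the variable names and the helper `words` are hypothetical and are not drawn from any particular grammar formalism.

```python
# Two hypothetical bracketings of the same word string, as nested tuples.
# Grouping shows which words combine first; it is not a real grammar.
painting_of_large_horse = (("large", "horse"), "painting")   # 'large' modifies 'horse'
large_painting_of_horse = ("large", ("horse", "painting"))   # 'large' modifies 'horse painting'

def words(tree):
    """Flatten a nested-tuple tree into its left-to-right word sequence."""
    if isinstance(tree, str):
        return [tree]
    result = []
    for child in tree:
        result.extend(words(child))
    return result

# The two structures contain the same words in the same order...
same_string = words(painting_of_large_horse) == words(large_painting_of_horse)
# ...but they are different structures.
same_structure = painting_of_large_horse == large_painting_of_horse

print(same_string)     # True
print(same_structure)  # False
```

The sketch shows only that one word string can correspond to two distinct structures, which is why a statement of compositionality has to mention syntactic structure and not merely the sequence of words.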

The principal claim regarding compositionality that philosophers have been concerned with is the claim that all actual and possible natural languages are compositional. A natural language is a language that humans learn to speak naturally, as part of their development, as opposed to an artificial language such as a computer programming language. In this context, the claim that natural languages are compositional amounts to the claim that the meanings of complex (multi-morphemic) expressions are determined by and only by (i) the ways their morphemes are put together by the morphosyntactic rules of the language and (ii) the meanings of those morphemes.

This may seem like a clear statement of a single thesis, but unfortunately there is wide philosophical disagreement concerning (a) what meanings are and (b) how we should understand ‘dependence’ in the statement of compositionality. We turn now to these two issues.

b. Meaning

There are two ways in which there are a wide variety of meanings of ‘meaning.’ First, many different philosophers will use the word ‘meaning’ and understand by it various distinct things. Some will think meanings are conceptual roles; others that they are set-theoretic objects and functions. Second, one and the same philosopher may recognize several types or dimensions of meaning. She may think, for example, that connotations are meanings in one sense, and that denotations are meanings in a different sense. In discussing compositionality, a reasonable stance is to consider all proposed types of meanings as bona fide meanings and therefore understand that there are numerous compositionality theses. For example:

Compositionality of stereotype: the stereotype associated with a complex expression E in a natural language is determined by (and only by) (i) E’s morphosyntactic structure and (ii) the stereotypes associated with E’s morphemes.

Compositionality of semantic features: the semantic features (e.g. [+male] or [+animate], as they attach to ‘he’ and ‘who,’ respectively) of a complex expression E in a natural language are determined by (and only by) (i) E’s morphosyntactic structure and (ii) the semantic features of E’s morphemes.

It goes like this for each possible type or dimension of meaning. The philosophical question is which, if any, of these theses is true. Any argument for or against compositionality should make it clear what conception of meaning it takes to be or not to be compositional. It is quite possible that there are several legitimate conceptions of meaning, each deserving the name ‘meaning,’ where based on some of those conceptions, natural languages are compositional, and based on others, they are not.

The question that has perhaps most concerned philosophers interested in compositionality is whether the truth-conditions of a sentence depend on (and only on) its syntax and the meanings of its simple parts. The truth-conditions of a sentence are simply the conditions under which the sentence is true. The truth-conditions of a sentence do not depend only on its syntax and the meanings of its simple parts if that sentence is true in some conditions and false in others, even though it has the same syntax and the same assignment of meanings to its simple parts. For example, we will later consider sentences such as ‘It is midnight.’ Sometimes this sentence is true, but other times—apparently without a change in the meanings of the words or in the way they are combined—it is false. This is an apparent violation of the compositionality of truth-conditions.

c. Dependence

Dependence and determination are common and vital notions in philosophy, though they are in many ways ambiguous. Sometimes dependence is a functional notion, as in: “the signs of two numbers determine the sign of their product (the sign of their product depends on their signs).” Dependence can also be a causal notion, as in: “the success of our movie depended on our advertising campaign.” It can be a constitutive notion, as in: “whether I win depends on whether I get a card lower than 4.” Regarding the compositionality thesis, there are many ways the notion of dependence has been understood.

i. Functional Dependence

One way of understanding the sense in which the meaning of the whole, according to compositionality, “depends on” the meanings of the parts, and the way those parts are combined, is reading “depends on” as “is a function of.” That is, a symbolic system is compositional if, and only if, the meaning of each complex expression E in that system is a function of (a) E’s syntactic structure and (b) the meanings of E’s simple parts.

A function is a pairing of an input (an element of its domain) with an output (an element of its range). Familiar functions from mathematics are addition, subtraction, and multiplication. For example, the addition function takes two inputs and returns as output their sum: + takes 2 and 3 as inputs and returns 5 as output. The important thing about functions is that for any sequence of inputs there can only be one output. + never takes two numbers and returns both 5 and 7 as outputs. An example of a mathematical relation that is not a function is “is a square root of,” because, for instance, 4 has two square roots, +2 and –2. (The symbol √x is conventionally reserved for the non-negative root precisely so that it does denote a function.)
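The one-output-per-input requirement can be checked mechanically on any finite list of pairings. The sketch below is illustrative only: the helper name `is_function` and the two sample lists are assumptions made for the example, with the second list pairing each number with its square roots.

```python
def is_function(pairs):
    """Return True if no input is paired with two distinct outputs."""
    seen = {}
    for inp, out in pairs:
        if inp in seen and seen[inp] != out:
            return False          # same input, different outputs
        seen[inp] = out
    return True

# A few pairings from addition: each pair of numbers has exactly one sum.
addition = [((2, 3), 5), ((1, 1), 2), ((3, 2), 5)]

# Pairing a number with its square roots: 4 is paired with both 2 and -2.
square_root_of = [(4, 2), (4, -2), (9, 3), (9, -3)]

print(is_function(addition))        # True
print(is_function(square_root_of))  # False
```

Note that two different inputs may share an output, as (2, 3) and (3, 2) do under addition; what a function rules out is a single input with more than one output.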

While we usually talk about functions only in the context of mathematics, common functions are all around us. Consider the function “(biological) mother of”. The inputs to this function are organisms and the outputs are their (biological) mothers. “(Biological) mother of” is a function because it pairs inputs with outputs and it never pairs the same input with distinct outputs (everyone has only one biological mother).

To say that the meaning of an expression E is a function of its syntactic structure and the meanings of its simple parts is to say that there is a function that takes E’s syntactic structure and the meaning of E’s simple parts as input, and returns as output E’s meaning.
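As a rough illustration of what such a function might look like, here is a minimal sketch for a toy symbolic system. Everything in it is an assumption made for the example: expressions are written as tuples of the form (rule, part, part, ...), names are taken to mean individuals, verbs to mean sets of individuals, and sentences to mean truth values. This is a deliberately crude stand-in and is not a claim about what meanings really are.

```python
# Hypothetical toy lexicon: the meanings of the simple parts.
LEXICON = {
    "Sally": "sally",
    "Tom": "tom",
    "perspires": frozenset({"sally"}),
    "sleeps": frozenset({"tom"}),
}

# One combining operation per (hypothetical) syntactic rule.
RULES = {
    "SUBJ_PRED": lambda subject, predicate: subject in predicate,
    "AND": lambda left, right: left and right,
}

def meaning(expr, lexicon=LEXICON, rules=RULES):
    """Compute a meaning from syntactic structure and part meanings alone."""
    if isinstance(expr, str):      # simple part: look up its meaning
        return lexicon[expr]
    rule, *parts = expr            # complex expression: (rule, part, part, ...)
    return rules[rule](*(meaning(p, lexicon, rules) for p in parts))

sally_perspires = ("SUBJ_PRED", "Sally", "perspires")
both = ("AND", sally_perspires, ("SUBJ_PRED", "Tom", "perspires"))

print(meaning(sally_perspires))  # True
print(meaning(both))             # False
```

The point of the sketch is what `meaning` does not consult: it has no access to the speaker, the conversation, or the shape and sound of the words. Its only inputs are the structure of the expression and the meanings of its simple parts.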

ii. The Substitution Principle

If a language L is compositional in the functional sense described in the previous section, then that language satisfies the substitution principle:

SP: If you take any expression E of L, and any morpheme M that occurs in E, and you replace M with a different morpheme M* of L that has the same meaning as M, then the result will have the same meaning as E.

For example, “Sally perspires” is an expression of English. Let’s assume that ‘perspires’ and ‘sweats’ have the same meaning. Then what SP says is that “Sally sweats” has the same meaning as “Sally perspires.” In other words, substituting an expression with one meaning for another expression with the same meaning does not change the meaning of the whole.

If compositionality is true, then SP is true. Remember that a language is compositional when there is a function that, for every expression E in the language, takes E’s syntactic structure and the meaning of E’s simple parts as input, and returns as output E’s meaning. If in expression E, you replace one of E’s morphemes M with another morpheme M* that has the same meaning as M, then you haven’t changed the inputs to the function: the function takes the meanings of the parts as inputs, and though you’ve changed the parts, they still have the same meaning. Since functions always return the same output when given the same input, the meaning of E-with-M*-replacing-M must be the same as the meaning of E-with-M.
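The reasoning can be made concrete with a small Python sketch. The lexicon, the string used to label the syntactic structure, and the stand-in function compose are hypothetical; the only point is that a meaning function which sees nothing but the structure and the meanings of the parts cannot tell synonyms apart.

```python
# Hypothetical toy lexicon in which 'perspires' and 'sweats' are assigned
# the same meaning.
LEXICON = {
    "Sally": "SALLY",
    "perspires": "SWEAT",
    "sweats": "SWEAT",
    "sleeps": "SLEEP",
}

def compose(structure, part_meanings):
    """Stand-in for the meaning function: its only inputs are the syntactic
    structure and the meanings of the simple parts."""
    return (structure, tuple(part_meanings))

def meaning(structure, morphemes):
    # The morphemes themselves never reach compose, only their meanings do.
    return compose(structure, [LEXICON[m] for m in morphemes])

original = meaning("S -> NP VP", ["Sally", "perspires"])
substituted = meaning("S -> NP VP", ["Sally", "sweats"])
different = meaning("S -> NP VP", ["Sally", "sleeps"])
```

Here original and substituted are equal because compose receives identical inputs in both cases, while different is distinct because one of the part meanings has changed. Any other choice of compose would give the same pattern, so long as it is a function of those two inputs alone.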

It is also true that if a husserlian language satisfies the substitution principle, then the language is compositional in the functional sense (see [9]). A language is husserlian if one synonym can be substituted for another synonym without changing the grammaticality of the result. For example, no husserlian language can have synonyms ‘likely’ and ‘probable’ where:

‘It is likely that the Spurs will win’ is grammatical.

‘It is probable that the Spurs will win’ is grammatical.

But:

‘The Spurs are likely to win’ is grammatical.

‘The Spurs are probable to win’ is ungrammatical.

So long as all such pairs as ‘likely’ and ‘probable’ here are assigned different meanings, the substitution principle and the functional conception of compositionality are equivalent.

iii. Problems for Functional Dependence

While the functional conception of compositionality is easy to characterize and understand, it fails to capture the full force of the constraint many philosophers have thought compositionality imposes upon semantic theories for natural languages. This is because many semantic theories which are not intuitively compositional are compositional in the functional sense.

One way to see this is by noting that any symbolic system that contains no synonyms and assigns exactly one meaning to each expression is compositional in the functional sense. If a symbolic system contains no synonyms, the meaning function for that language can’t treat two expressions differing only in the substitution of synonyms differently (because there are no such expressions). Thus for any expression E of S, there is a function F that takes E’s syntactic structure and the meanings of E’s parts as inputs and returns the meaning of E as output. This entails that a non-compositional language could be made compositional solely by removing a few redundant expressions (synonyms of other expressions in the language).

Second, the functional conception of compositionality does not demand any particular relatedness among the meanings of related expressions. The functional conception requires only that the meaning function not assign different meanings to expressions that differ only in the substitution of synonyms. It does not require that the meanings it does assign to complex expressions be in any natural way related to the meanings of their parts, or to the meanings of other complex expressions composed of similar parts. For example, consider these meaning assignments:

(1) Le chien aboie. → The dog barks.
(2) Le chat aboie. → The cat dances.
(3) Le chat pue. → The skunk eats.

Sentences (1) and (2) share a verb, but nothing about their assigned meanings is similar; (2) and (3) share a noun phrase, but again nothing about their assigned meanings is similar. Nevertheless, there exists a function that takes the syntax, and the meanings of the morphemes, of each expression on the left, and maps it to the meaning on the right: it’s displayed in (1)-(3). In fact, any random, unsystematic assignment of meanings to sentences is compatible with the functional conception of compositionality, provided that either there are no synonyms or that sentences that differ only in the substitution of synonyms are assigned the same meaning. This is ‘dependence’ only in the weakest sense of that word.
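To see that such an assignment really is a function, it can be written as a bare lookup table. In the Python sketch below, the word meanings and the bracket label for the syntactic structure are hypothetical. The table simply restates the three assignments above.

```python
# Hypothetical word meanings. No two words share a meaning, so the toy
# language contains no synonyms.
WORD_MEANING = {
    "le": "THE",
    "chien": "DOG",
    "chat": "CAT",
    "aboie": "BARKS",
    "pue": "STINKS",
}

# An arbitrary table from (syntactic structure, meanings of the parts)
# to sentence meanings, restating the three assignments in the text.
TABLE = {
    ("[[Det N] V]", ("THE", "DOG", "BARKS")): "The dog barks.",
    ("[[Det N] V]", ("THE", "CAT", "BARKS")): "The cat dances.",
    ("[[Det N] V]", ("THE", "CAT", "STINKS")): "The skunk eats.",
}

def sentence_meaning(structure, words):
    """Look up the sentence meaning from the structure and the part meanings."""
    key = (structure, tuple(WORD_MEANING[w] for w in words))
    return TABLE[key]

second = sentence_meaning("[[Det N] V]", ["le", "chat", "aboie"])
```

Each key occurs only once, so the table never pairs one input with two outputs. That is all the functional conception requires, even though no rule connects the entries to one another.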

iv. Dependence as Computability

As we shall see, the principal reason for the belief that natural languages are compositional is that only compositionality can explain how we can figure out the meanings of a large range of novel sentences and expressions, whose meanings we have not specifically learned at any point. Compositionality, construed as computability, says that if you know the syntactic structure of an expression E, and you know the meanings of E’s simple parts, this suffices for you to “work out” the meaning of E: there exists a procedure that you can use, which after a finite number of steps, tells you the meaning of E itself. In other words, the meaning of any expression E is computable from (a) E’s syntactic structure and (b) the meanings of E’s simple parts.

If the meaning of any expression E is computable from E’s syntactic structure and the meanings of E’s simple parts, then it is a function of E’s syntactic structure and the meanings of E’s simple parts. But the converse is not true, for not every function is computable.

While computability imposes some standard of systematicity in meaning assignments, it nevertheless allows more freedom than we might wish. Consider how different programs running on your computer produce wildly different outputs, even given the same sequence of keystrokes. The outputs of the programs are computed from the keystrokes, but they process that information in radically different ways, and produce outputs of radically different characters. The keys used to type the previous sentence in a word processor might result in a complicated series of moves if typed in a fantasy role-playing game. The computability conception of compositionality says that the transition from the syntax of a complex expression and the meanings of its parts to the meaning of that expression must be a function of the syntax and the meanings of the parts, and that it must be rule-governed; but it doesn’t say anything about what the rules are or can be, except that they can be carried out in a finite number of steps and involve no randomness.

v. Dependence as Mereology

The functional and computational conceptions of dependence, with regard to the thesis that natural languages are compositional, are seemingly weaker than the pre-theoretical conception of dependence that occurs in the thesis itself. There is another conception of dependence in the literature that can reasonably be characterized as too strong (though it is not necessarily false that languages are compositional in this sense).

On this conception, the meanings of the parts of a complex expression are literally part of the meaning of that expression. To see how this could be, consider the view that the meaning of a sentence is a structured proposition. The French sentence [[le chien] aboie], where bracketing indicates syntactic structure, means a structured proposition such as <<the dog> barks>, where the words inside the angle brackets stand for the meanings of ‘le,’ ‘chien,’ and ‘aboie,’ respectively. On this view, the meaning of ‘chien,’ for example, is literally a part of the meaning of ‘le chien aboie.’

This notion of dependence is quite strong: the meaning of a complex expression is made out of its syntactic structure and the meanings of its parts. And while many theories of the meanings of complex expressions, such as the theory of structured propositions, validate the principle of compositionality as interpreted with the mereological conception of dependence, it should be clear that this is more than what philosophers normally mean when they say natural languages are compositional.
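The mereological picture can be sketched in Python by modelling a structured proposition as a nested tuple. The atoms THE, DOG and BARKS are placeholders for whatever the meanings of ‘le,’ ‘chien,’ and ‘aboie’ turn out to be, and the helper is_part is hypothetical.

```python
# Placeholder atoms standing in for the meanings of 'le', 'chien', 'aboie'.
THE, DOG, BARKS = "THE", "DOG", "BARKS"

# [[le chien] aboie] is paired with a proposition of matching structure.
proposition = ((THE, DOG), BARKS)

def is_part(candidate, whole):
    """True if `candidate` is `whole` itself or occurs somewhere inside it."""
    if candidate == whole:
        return True
    return isinstance(whole, tuple) and any(is_part(candidate, p) for p in whole)

dog_is_part = is_part(DOG, proposition)      # the meaning of 'chien'
cat_is_part = is_part("CAT", proposition)    # a meaning not in the sentence
```

On this encoding the meaning of ‘chien’ can be recovered by inspecting the meaning of the sentence. The functional and computational conceptions do not require that, since they place no constraint on the internal make-up of the output.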

vi. The Empirical Conception of Dependence

Finally, it’s possible to define compositionality in terms of the role that it plays in explaining certain of our linguistic abilities. In particular, many philosophers have thought that unless the meanings of complex expressions in natural languages depend on (and only on) (a) the syntax of those expressions and (b) the meanings of those expressions’ parts, we would not be able to learn and understand the languages we in fact learn and understand. Thus we can understand “dependence” here as whatever relation in fact obtains between the meaning of a complex expression and that expression’s syntax and the meanings of its parts that in fact explains our ability to learn and understand new expressions whose meanings we have not learned specifically. We know that language is compositional, but it is an empirical question as to just what compositionality consists in.

The empirical conception of compositionality need not be thought of as a competitor to the other conceptions considered above. Instead, it provides a methodological backdrop against which we can evaluate various proposals regarding the sense of “dependence” at the heart of compositionality. As we saw, the functional conception of dependence is ill-favored precisely because it fails to explain our abilities to learn and understand the natural languages we speak. Any proposed account of compositionality not only has to meet certain internal criteria, such as clarity and consistency, but it also has to (a) actually be true of the languages we speak and (b) actually explain our abilities to learn and understand those languages.

There is of course the possibility that no dependence relation that obtains only between the meanings of complex natural language expressions and their syntax and the meanings of their simple parts plays a discernible role in our linguistic abilities. Perhaps the meanings of complex expressions are partly determined by prior discourse, speaker intentions, salient objects and events in the environment, or the non-semantic character of those expressions’ simple parts, such as their shape or sound. In such an event, it might turn out not just that natural languages are not compositional, but that “compositionality” is without application, its introduction having rested on a false presupposition.

2. Arguments for Compositionality

a. Novelty

We are capable of understanding a very large number—perhaps an infinite number—of sentences that we have never heard before. Consider the sentence frame F:

There is a ______ on television.

Anything describable could be written in the blank: orange-and-green polka-dotted squid, shoe sharpener, cauliflower-shaped spacecraft from Saturn…. The first thing to notice is that you would understand each of these sentences, even though presumably you’ve never heard them before and no one has ever taught you the meaning of the specific sentence ‘There is a cauliflower-shaped spacecraft from Saturn on television.’ There are quite a lot of things that are describable in English, and so quite a lot of sentences that fit frame F. Each English speaker has only heard a tiny fraction of these sentences before, but every English speaker understands all of them (or at least those containing the English words that she knows).

If we understand the meaning of a new sentence whose meaning we haven’t been specifically taught before, it must be that we can work out its meaning from information available to us when we hear that sentence and other things that we have already learned.

Suppose for a moment that English is a compositional language, in the sense that the meaning of a sentence of English can be computed (worked out) from its syntactic structure and the meanings of its morphemes. This would explain how one could understand a novel utterance such as ‘There is a cauliflower-shaped spacecraft from Saturn on television.’ English speakers who have never learned the meaning of this sentence specifically have nevertheless learned the meanings of each of the words in it: ‘cauliflower,’ ‘shape,’ the past tense morpheme ‘–ed,’ ‘spacecraft,’ and so forth. Furthermore, part of mastering a language involves acquiring the ability to parse sentences of that language, that is, to figure out their syntactic structure: for example, figuring out that ‘cauliflower-shaped’ modifies ‘spacecraft,’ but ‘on television’ doesn’t modify ‘Saturn.’ Thus if English is compositional, English speakers have all they need to understand novel English sentences they have never encountered before, provided those sentences don’t contain unfamiliar words.

We can summarize the argument from novelty as follows:

Premise 1. We are capable of understanding a very large number of English sentences that we have never heard before, whose meanings we have not specifically been taught.

Premise 2. If English is compositional, then English speakers have all the abilities and information they need to understand English sentences they have never encountered before.

Conclusion: The best explanation for the facts described in Premise 1 is that English is in fact compositional.

The premises of the argument from novelty are largely uncontroversial. Since the premises are equally true if ‘English’ is replaced by any other natural language, be it ‘Cantonese’ or ‘Kalaallisut’, the argument suggests that all natural languages are compositional.

As with any inference to the best explanation, however, the argument from novelty is only compelling if there aren’t better or equally good explanations for the target phenomenon—in this case, for English speakers’ ability to understand novel English sentences. It is obvious that if we understand the meaning of a new sentence whose meaning we haven’t been specifically taught before, it must be that we can work out its meaning from information available to us when we hear that sentence and other things that we have already learned. But the information available to us is not limited to (i) the sentence’s syntactic structure and (ii) the meanings of its simple parts. When we hear a novel sentence, we also have information about:

Things said earlier in the conversation
The beliefs or intentions of the person uttering S
Salient objects and events in the environment at the time S is uttered
The non-semantic character of S’s simple parts, such as their shape or sound

If the meaning of a complex expression directly depended on any of these things, we could still explain how English speakers can understand novel utterances, because these are things available to speakers and hearers in a conversation. The argument from novelty can’t by itself establish that all natural languages are compositional, and for that reason it is usually offered with additional arguments for compositionality, to which we now turn.

b. Systematicity

It is commonly argued that the systematicity of natural languages provides good reason to suppose languages are compositional. However, most of the literature fails to provide a clear characterization of systematicity and sometimes very distinct phenomena are all crowded under the one heading.

On the most common way of understanding systematicity, language L is systematic if, and only if, for all expressions E₁, E₂, and E₃ in L, if E₁ can syntactically combine with E₂ to form a grammatical sentence, and E₃ is of the same syntactic category as E₂, then E₁ can combine with E₃ to form a grammatical sentence. For example, the English expression ‘Fred’ can combine with the expression ‘eats bananas’ to form the grammatical sentence ‘Fred eats bananas.’ Since ‘George’ is of the same syntactic category as ‘Fred’ (proper names), if English is systematic then we expect that ‘George eats bananas’ is also a grammatical sentence. Since it is, and since examples such as this are easy to come by, it is often assumed by philosophers that English and other natural languages are systematic, in this sense.

There are reasons to think that English and other natural languages are not systematic in this sense. For example, so-defined, a language is systematic only if its syntactic rules contain no semantic or phonological constraints: it says that any expression can be substituted for any other expression of the same syntactic category, regardless of differences in meaning or phonology between the two expressions.

Whether a language is systematic, in the sense just discussed, is not obviously relevant to whether it is compositional. After all, systematicity in that sense is only a constraint on which sentences must be grammatical if certain other sentences are grammatical. A language being systematic in that sense is compatible with that language having a non-compositional meaning function.

There is, however, another sense of systematicity that is more difficult to precisely characterize, but which is in fact relevant to whether languages are compositional. Consider these two claims about English: For English expressions E₁, E₂, E₃, and E₄, when the following conditions are met:

E₁ can combine with E₂ to form a grammatical sentence [E₁ E₂].

Example: ‘Dogs’ can combine with ‘chase cars’ to form the sentence ‘Dogs chase cars.’

E₃ can combine with E₄ to form a grammatical sentence [E₃ E₄].

Example: ‘Cats’ can combine with ‘eat mice’ to form the sentence ‘Cats eat mice.’

E₁ is of the same grammatical category as E₃.
E₂ is of the same grammatical category as E₄.

Then the following two claims hold:

Claim 1: Anyone who can understand [E₁ E₂] and [E₃ E₄] can also understand [E₁ E₄] and [E₃ E₂], when the latter are well-formed.

Example: Anyone who can understand ‘dogs chase cars’ and ‘cats eat mice’ can also understand ‘dogs eat mice’ and ‘cats chase cars.’

Claim 2: The meanings of [E₁ E₂] and [E₃ E₄] are predictably related to the meanings of [E₁ E₄] and [E₃ E₂], when the latter are well-formed.

Example: ‘dogs chase cars’ has a meaning that is predictably related to both ‘dogs eat mice’ and ‘cats chase cars.’

It can be argued that any language that is like English in this way is most likely a compositional language. The argument runs as follows. If English is compositional, then understanding ‘dogs chase cars’ and ‘cats eat mice’ involves (a) knowing the meanings of all the morphemes in the two sentences and (b) being able to recognize the syntactic structure of both sentences. Furthermore, if English is compositional, such knowledge and abilities suffice to understand ‘dogs eat mice’ and ‘cats chase cars.’ For these sentences are composed of the same morphemes, put together in the same syntactic structures. Thus the best explanation for why Claim 1 is true of English is that English is in fact compositional.

A similar argument can be built around Claim 2. If English is compositional, then the meanings of English expressions are completely determined by (a) their syntactic structure and (b) the meanings of their morphemes. Since the expressions ‘dogs chase cars’ and ‘dogs eat mice’ partially overlap in their morphemes, they partially overlap in what determines their meanings, if compositionality is true. Thus the fact that they have related meanings is some evidence that English is in fact compositional.

Neither of these arguments is very strong on its own, though each may be combined with other arguments or evidence for compositionality to marshal a stronger case. First, it can be argued that Claim 1 and Claim 2 are not true of all English expressions E₁, E₂, E₃, and E₄. With regard to Claim 1, someone might, for instance, know what ‘natural disaster’ and ‘wine selection’ mean without knowing what ‘natural selection’ means. This is because, in particular, the meaning of ‘natural selection’ is not wholly predictable from the meanings of ‘natural’ and ‘selection.’ Second, both arguments are inferences to the best explanation: they claim, respectively, that the compositionality of English best explains Claim 1, and that it best explains Claim 2. However, there are non-compositional meaning functions that also predict Claims 1 and 2. For example, if the meaning of a complex expression is a function of the meanings of its parts and the phonetic properties of its parts, then it would be no surprise, for instance, that sentences with overlapping morphemes had overlapping meanings. Thus whether compositionality is the best explanation for these claims may depend on what other independent reasons we have for accepting that English is compositional.

c. The Inductive Argument

A third argument for compositionality is predicated on (a) the apparent compositionality of a wide variety of linguistic phenomena and (b) the success of compositional semantics in compositionally analyzing apparently non-compositional linguistic phenomena.

Consider a simple English sentence: ‘Jenny loves baseball.’ Even without a well-defined notion of dependence, it is difficult to see how the meaning of this sentence depends on anything other than the meanings of ‘Jenny,’ ‘loves,’ and ‘baseball,’ and the way those words are syntactically combined. External features such as the intentions of a speaker using the sentence on a particular occasion, and the context in which the sentence is used, may well affect what gets implicated by the sentence, but don’t apparently affect its literal meaning. Furthermore, formal features of the sentence, such as the fact that each of the words it contains has two syllables, are also apparently irrelevant to its literal meaning. The meaning of ‘Jenny loves baseball’ apparently depends on, and only on, (a) its syntax and (b) the meanings of its simple parts. This sentence, and a large portion of the language we speak, is apparently compositional.

Now consider a different example: ‘Every girl loves some sport.’ This sentence has two meanings. First, it can mean that for each girl, there is some sport she loves—even if for different girls it’s different sports. For example, if Jenny and Liz are the only girls, the sentence will be true if Jenny loves baseball and no other sport and Liz loves hockey and no other sport. Second, it can mean that there is one particular sport that every girl loves. If Jenny loves only baseball and Liz only hockey, then the sentence is false, because there is no sport loved by all girls. This sentence is therefore apparently non-compositional. On every occasion of use, the sentence appears to have one and the same syntactic structure, and its parts all appear to have the same meanings. If compositionality were true, then, the sentence couldn’t have different meanings on different occasions, because what determines its meaning is the same on all occasions. And yet, it apparently does have different meanings on different occasions.

This is not an argument against the compositionality of English, but rather one for it. The second half of the inductive argument for compositionality concedes that there are indeed a great many apparently non-compositional linguistic phenomena in English—this quantifier scope case being just one among them. However, the argument continues, a rather large subset of the great many apparently non-compositional phenomena have been considered by linguists in the past several decades and been given satisfactory compositional analyses. (With regard to our example, the most common solution has been to regard it as really having two syntactic structures, corresponding to its two meanings. See the References and Further Reading.) Since compositional semantics has been such a fruitful and successful research program in the past and there’s no reason to think it will cease to be in the future, we have strong reason to suppose that English is in fact compositional, even if some of it appears not to be.
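The two readings of ‘Every girl loves some sport’ described above can be made explicit by evaluating each against the Jenny-and-Liz scenario. The Python sketch below does so. The set encoding of who loves what is an assumption made for the example, and the sketch takes no stand on how the two readings arise syntactically.

```python
# The scenario from the text: Jenny loves only baseball, Liz only hockey.
girls = ["Jenny", "Liz"]
sports = ["baseball", "hockey"]
loves = {("Jenny", "baseball"), ("Liz", "hockey")}

# Reading 1: for each girl, there is some sport she loves.
every_some = all(any((g, s) in loves for s in sports) for g in girls)

# Reading 2: there is one particular sport that every girl loves.
some_every = any(all((g, s) in loves for g in girls) for s in sports)
```

In this scenario every_some is True and some_every is False, matching the verdicts in the text. The two readings differ only in the order in which the quantifiers are evaluated. The two-structure analysis attributes that difference to the syntax rather than to the meanings of the words.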

The inductive argument holds up the past successes of compositional semantics as a good reason to believe that English (and any other language we’ve seriously and successfully investigated) is compositional. However, there remain apparently non-compositional linguistic phenomena that have not been given universally agreed upon—or even widely endorsed—compositional analyses (see section 4, Challenges to Compositionality). Some of these cases, such as generic statements, may well have particular features that justify us in thinking that they cannot be given compositional analyses.

One additional point is worth making. A common construal of compositional semantics in linguistics is that the goal is to assign logical forms (LFs) to sentences of natural language in a compositional way. LFs are themselves representations and are not (standardly considered) the same things as meanings. LFs are “in the head,” unlike propositions, states of affairs, situations, truth-conditions, and so forth. Thus, the fact that an LF can be compositionally determined from the (a) syntactic structure of a sentence and (b) the lexical entries for that sentence’s morphemes does not entail that the meaning of the sentence is determined by those things—at least not without further argumentation. Thus the past success of semantic theory could be irrelevant to the question whether natural languages are compositional.

3. The Dialectical Role of Compositionality in Philosophy

a. Real Meanings

Section 1.b endorsed a sort of meaning pluralism, that all proposed meanings (stereotypes, features, referents, senses…) were bona fide meanings and that it made sense to ask whether meaning was compositional, in any of the bona fide senses. But compositionality can also be used as a litmus test for determining which of these meanings is important or relevant to philosophical theorizing, as follows:

X is the Real Meaning of expression E =_df. Understanding E requires pairing it with X.

The Real Meaning of an expression is the meaning whose grasp is both necessary and sufficient for understanding that expression. This notion of Real Meaning can then be used to discredit various meanings that are not compositional, as follows. As the argument from novelty suggests, our ability to understand new sentences whose meanings we have not specifically learned, requires that we compute those meanings from the sentences’ syntactic structures and the meanings of their parts. Thus, the Real Meaning of complex expressions in English must be compositionally determined. Therefore, if Y-meanings are not compositionally determined, then Y-meanings aren’t Real Meanings.

b. Semantic Theories Purportedly at Odds with Compositionality

The principle of compositionality has been employed in arguments against almost every semantic theory, including theories in metaethics of the meaning of normative terms. Presented here are four illustrative examples: first, Frege’s puzzle for the “naïve theory” of meaning of names, that names mean what they name; second, two very standard cases of discrediting theories (in this case, conceptual role semantics and verificationism) with the principle of compositionality; finally, the Frege-Geach problem for non-cognitivist theories in metaethics. Other examples can be found in References and Further Reading.

i. Direct Reference Theory

According to the “naïve theory” of the meaning of proper names (often also called the direct reference theory) the meaning of a name is its referent, the thing it names. If the direct reference theory is true and compositionality is true, it follows that two sentences that differ only in the substitution of one co-referring name for another will mean the same thing. For example, sentences (a) and (b) will mean the same thing, because “Lady Gaga” and “Stefani Germanotta” both refer to the same person:

(a) Lady Gaga is a professional singer.

(b) Stefani Germanotta is a professional singer.

This seems like a reasonable position. Whenever (a) is true, (b) is also true, and vice versa. So (a) and (b) have the same truth-conditions, and it’s reasonable to then think they have the same meaning. But now consider two other sentences that are like (a) and (b) in that they differ only in the substitution of one co-referring name for another:

(c) Elaine expects to see Lady Gaga.

(d) Elaine expects to see Stefani Germanotta.

Since Lady Gaga is Stefani Germanotta, the direct reference theory (plus compositionality) predicts that (c) and (d) have the same meaning. But prima facie, it seems that (c) could be true and (d) false, or (d) true and (c) false. Elaine may have heard Lady Gaga on the radio, and purchased a ticket to her concert, completely oblivious to the fact that Lady Gaga is Stefani Germanotta. She expects to see Gaga, but would be very surprised to learn she was to see Germanotta. She might even become angry at learning that Germanotta will be performing all night, because she prefers to see Gaga.

It follows that three things are inconsistent: (i) our naïve judgments regarding the truth-conditions of (c) and (d); (ii) the direct reference theory; and (iii) the thesis that English is compositional. This is called “Frege’s Puzzle” after Gottlob Frege, who first posed it. Some philosophers have taken it as a reason to reject the direct reference theory.

ii. Conceptual Role Semantics

According to the inferentialist, the meaning of a simple sentence of the form x is an F is the set of sentences we can infer are (probably) true, assuming x is an F. For example, the meaning of “This is a tree” would be a set of sentences containing things such as “This has leaves,” “This is a plant,” “This has branches,” “This grows,” “This is relatively stationary,” and so forth. The inferentialist further holds that the meaning of a complex sentence is also the set of sentences we can infer are (probably) true from it. This is a variety of conceptual role semantics.

Now consider the sentence “This is a green fish.” Green fish are relatively uncommon, so plausibly you can infer “This is rare” from “This is a green fish,” and thus according to conceptual role semantics “This is rare” is an element of the meaning of “This is a green fish.” However, neither green things nor fish are uncommon in nature. So “This is rare” is not an element of the meaning of either “This is green” or “This is a fish.”

This is just one example of the broader principle that the normal features of things that are F and G are not a function of the normal features of things that are F and the normal features of things that are G. Thus, the set of sentences expressing the normal features of things that are F and G will not be a function of the set of sentences expressing the normal features of things that are F and the set of sentences expressing the normal features of things that are G. That is, this version of conceptual role semantics is incompatible with compositionality.

iii. Verificationism

Compositionality presents analogous troubles for theories that are similar to conceptual role semantics, such as the theory that the meaning of a sentence is the set of experiences that confirm it or the theory that the meaning of an expression is a stereotype. Suppose that the meaning of a sentence S is the set of experiences E such that E raises the probability that S is true.

For the sake of the example, suppose that cows make up a tiny proportion of the dangerous animals, and that brown animals also make up a tiny proportion of the dangerous animals. Further, all dangerous cows are brown and all dangerous brown animals are cows. Now suppose you encounter one and only one animal and have experience E, an animal mauling. E lowers the probability that the animal was brown, because most dangerous animals are not brown. E lowers the probability that the animal was a cow, because most dangerous animals are not cows. But E raises the probability that the animal was a brown cow.

The set of experiences that confirms “This is a brown cow” is not a function of the set of experiences that confirms “This is a brown thing” and the set of experiences that confirms “This is a cow.” Thus verificationism is incompatible with compositionality.
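The brown-cow case can be checked against concrete numbers. The Python sketch below uses population counts invented purely for illustration, and it makes the simplifying assumption that all and only dangerous animals maul. Any other counts that respect the stipulations of the example would serve as well.

```python
from fractions import Fraction

# Hypothetical counts: (is_brown, is_cow, is_dangerous) -> number of animals.
# Every dangerous cow is brown and every dangerous brown animal is a cow.
POPULATION = {
    (True,  True,  True):  5,     # dangerous brown cows
    (False, False, True):  95,    # all other dangerous animals
    (True,  False, False): 300,   # harmless brown non-cows
    (False, True,  False): 300,   # harmless non-brown cows
    (True,  True,  False): 1,     # harmless brown cows
    (False, False, False): 299,   # everything else
}

def prob(hypothesis, given=lambda b, c, d: True):
    """Probability that a randomly encountered animal satisfies
    `hypothesis`, conditional on `given`."""
    favourable = sum(n for k, n in POPULATION.items()
                     if given(*k) and hypothesis(*k))
    total = sum(n for k, n in POPULATION.items() if given(*k))
    return Fraction(favourable, total)

brown = lambda b, c, d: b
cow = lambda b, c, d: c
brown_cow = lambda b, c, d: b and c
mauled = lambda b, c, d: d   # assumption: all and only dangerous animals maul

# For each hypothesis: (prior probability, probability given the mauling).
results = {
    name: (prob(h), prob(h, given=mauled))
    for name, h in [("brown", brown), ("cow", cow), ("brown cow", brown_cow)]
}
```

With these counts the mauling lowers the probability of “brown” from 306/1000 to 1/20, and likewise for “cow”, but raises the probability of “brown cow” from 6/1000 to 1/20. The experience therefore belongs to the confirming set of the conjunction without belonging to the confirming set of either conjunct.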

iv. Moral Non-Cognitivism

According to the expressivist, sentences involving normative terminology such as ‘good’ and ‘bad’ and ‘right’ and ‘wrong’ play a different role in communication than ordinary descriptive sentences containing no such terminology. For example, when George says something descriptive, such as “figure-skating is difficult,” he is expressing his belief that figure-skating is difficult. The role of descriptive statements is to express one’s beliefs. But, according to the expressivist, the role of normative terminology is to express one’s approval or disapproval. When George says something normative, such as “figure-skating is right” or “figure-skating is wrong,” he is expressing his approval or disapproval of figure-skating.

Consider the sentence, “figure-skating is not wrong.” What does this sentence express? It’s not disapproval of figure-skating, obviously, because that’s what the expressivist thinks “figure-skating is wrong” means. But neither is it approval of figure-skating. You can think something is not wrong without thinking that it is right—figure-skating, for instance, is neither right nor wrong. It is morally neutral; it is morally permissible. Expressivist accounts then say that “figure-skating is not wrong” expresses the speaker’s toleration of figure-skating.

This treatment raises a question: Does the expressivist meaning of “figure-skating is not wrong” depend on and only on the expressivist meaning of “figure-skating is wrong” and the meaning of “not”? At first glance, it would seem that the answer is “no.” According to the expressivist, when George says “figure-skating is wrong” what this expresses is DIS:

DIS. George disapproves of figure-skating.

So when George says instead, “figure-skating is not wrong,” this should express something that is a combination of DIS and the meaning of “not.” Two options suggest themselves:

~DIS. George does not disapprove of figure-skating.

DIS~. George disapproves of not figure-skating.

But neither ~DIS nor DIS~ says the same thing as “George tolerates figure-skating,” which is the meaning of “figure-skating is not wrong,” according to the expressivist. ~DIS is consistent with George having no opinion regarding figure-skating. But tolerating figure-skating (thinking that it is not wrong, that it is an acceptable form of behavior) is having an opinion about figure-skating. It’s having the opposite opinion to one who thinks figure-skating is wrong. DIS~ is also not the meaning the expressivist wants. Tolerating figure-skating is not the same thing as disapproving of those who don’t skate. You can tolerate a behavior without being intolerant of those who don’t engage in it.

This is “the negation problem” for expressivism, but it is just part of a broader set of problems for moral non-cognitivist theories in meta-ethics. The broader set of problems, often called the Frege-Geach problem, concerns how non-cognitivist theories can deal with logically complex normative sentences (involving words such as “not,” “or,” and “if… then…”) and logical inferences.

4. Challenges to Compositionality

There is no end of linguistic phenomena that have been presented as challenges to the thesis that natural languages are compositional. The examples that follow are therefore intended to illustrate the sorts of problems the compositionality thesis faces, rather than constitute an exhaustive overview.

Section 4a considers an attempt to undermine the dialectical purpose of compositionality by showing that any meaning theory is compatible with the principle of compositionality. Section 4b focuses on context-sensitive expressions. Here Kaplan’s distinction between character and content is introduced as well as the strategy of handling apparently non-compositional phenomena by positing so-called “hidden indexicals.” The key idea introduced in this section is that while compositionality requires that the meanings of complex expressions depend only on their syntactic structure and the meanings of their morphemes, it allows simple expression meanings to depend on anything, including context, speaker intentions, and so on.

Section 4c covers the case of idioms. Although there are plenty of non-compositional idioms, this is not as devastating to the compositionality supporter as one might think. The key idea in 4c is that allowing exceptions to the principle of compositionality in cases where we have specifically learned the meaning of a complex expression doesn’t hurt the dialectical purposes that principle is mainly used for. A real problem for compositionality would be a large number of cases where we are able to understand complex expressions we have never heard before and those expressions are not compositional. Section 4d covers a productive construction in English that seems to suggest just such a problem for compositionality: noun modification.

a. The Triviality Objection

Consider the following argument: the debate over whether natural languages are compositional is pointless. Any language can be given a compositional semantics, for any proposed theory of what meanings are. If meanings are ideas, then we let the meaning of [dogs [chase cats]] be [the idea of dogs [the idea of chasing, the idea of cats]]. If meanings are stereotypes, then we let the meaning of [dogs [chase cats]] be [the stereotype of dogs [the stereotype of chasing, the stereotype of cats]], and so on. In general, the meaning of any complex expression is just that very expression, with the meanings of its simple parts in place of those parts. (This is a type of structured propositions view.)
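The recipe behind the triviality objection is mechanical enough to write down. The following Python sketch is only an illustration: it represents bracketed expressions as nested lists, and the two small lexicons are hypothetical stand-ins for whatever a given theory takes word meanings to be.

```python
def trivial_meaning(expression, lexicon):
    """Copy the bracketed expression, swapping each word for its meaning."""
    if isinstance(expression, str):
        return lexicon[expression]  # simple part: look up its meaning
    return [trivial_meaning(part, lexicon) for part in expression]


# Hypothetical lexicons for two different theories of word meaning.
ideas = {
    "dogs": "the idea of dogs",
    "chase": "the idea of chasing",
    "cats": "the idea of cats",
}
stereotypes = {
    "dogs": "the stereotype of dogs",
    "chase": "the stereotype of chasing",
    "cats": "the stereotype of cats",
}

sentence = ["dogs", ["chase", "cats"]]  # [dogs [chase cats]]

print(trivial_meaning(sentence, ideas))
print(trivial_meaning(sentence, stereotypes))
```

The same function works for any lexicon whatsoever, which is the point of the objection: the result is compositional by construction, whatever the theory says word meanings are.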

There are two main reasons the triviality objection fails to convince most philosophers. First, while one can give such meaning theories for complex expressions, these meaning theories conflict with other principles that seem reasonable to hold. For example, we might think that the meaning of ‘cow’ and the meaning of ‘brown cow’ should be the same general type of thing. If the meaning of ‘cow’ is an idea, the meaning of ‘brown cow’ should also be an idea; if the meaning of ‘cow’ is a property—such as the property of being a cow—then the meaning of ‘brown cow’ should also be a property—such as the property of being a brown cow. But according to the triviality objection, we must say instead that while ‘cow’ means the idea of a cow, ‘brown cow’ means a structured complex containing two ideas: the idea of brown and the idea of a cow.

Second, even if structured propositions don’t violate any of our other commitments, most structured propositionalists believe that the structured proposition that is the meaning of a sentence determines the truth-conditions of that sentence. And it is far from obvious that one can work out the truth-conditions of ‘this is my pet fish’ from a structured proposition containing the stereotype of a pet and the stereotype of a fish. It is not a trivial question to ask whether the truth-conditions of a sentence depend on (and only on) that sentence’s syntax and the meanings of its simple parts.

b. Context-Sensitive Expressions

Consider the sentence ‘I am Socrates.’ Sometimes when the sentence is uttered, it is true; at other times it is false. Although we might try to defend the claim that true utterances of ‘I am Socrates’ have a different syntactic structure from false utterances of ‘I am Socrates,’ this seems wholly implausible. Clearly the truth or falsity of the sentence depends on who is saying the sentence.

At first, this might seem like proof that the truth-conditions of English sentences are not determined compositionally. Here is the argument: suppose that Aristotle says, ‘I am Socrates.’ This sentence is false because its truth-value depends on who says it: it is true only if the person who says it is Socrates. However, Aristotle is not the meaning of ‘am’ or ‘Socrates,’ as anyone can tell. Aristotle is also not the meaning of ‘I,’ otherwise when Socrates says ‘I am Socrates’ he would mean ‘Aristotle is Socrates.’ So the truth-value of ‘I am Socrates’ depends on something that is not its syntactic structure and is not the meanings of any of the words comprising it. And it doesn’t help to say that ‘I’ means ‘the person saying this sentence,’ because now we are faced with the exact same problem: sometimes ‘The person saying this sentence is Socrates’ is true and sometimes it is false. But it has the same syntactic structure and its morphemes mean the same thing on both the true occasions of utterance and the false ones.

Now we can unravel what’s going on here. There is one sense in which ‘I’ has the same meaning every time it is used. We can call this the character of ‘I.’ There is another sense in which ‘I’ has a different meaning when different people use it. Call this the content of ‘I.’ Character is a rule for determining content. The rule for ‘I’ is: the content of ‘I’ any time it is used is the person who is using it. So when Aristotle and Socrates both use the word ‘I’ it has different contents for each use—Aristotle and Socrates, respectively—but those contents are determined by one and the same character (rule). The truth of ‘I am Socrates,’ when used by any particular person, is completely determined by (and only by) the syntax of the sentence and the contents of its morphemes.
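The distinction between character and content can be pictured as the difference between a function and the value it returns. The following Python sketch is a simplification under stated assumptions: a context records only its speaker, ‘am’ is treated as identity, and the names `Context`, `character_of_I`, and `character_of_socrates` are invented for the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """A context of utterance, reduced here to the speaker alone."""
    speaker: str


def character_of_I(context):
    """The rule for 'I': its content is whoever is using it."""
    return context.speaker


def character_of_socrates(context):
    """A name has a constant character: the same content in every context."""
    return "Socrates"


def i_am_socrates(context):
    """Truth-value of 'I am Socrates', computed only from the contents of its parts."""
    subject = character_of_I(context)
    complement = character_of_socrates(context)
    return subject == complement  # 'am' is treated as identity (an assumption)


print(i_am_socrates(Context(speaker="Socrates")))   # True
print(i_am_socrates(Context(speaker="Aristotle")))  # False
```

Context enters only through the simple expression ‘I.’ Once the contents of the parts are fixed, the truth-value of the sentence is computed from those contents and nothing else.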

English has a variety of expressions that differ in content from context to context. We call these context-sensitive expressions:

Now, today, yesterday, tomorrow
Here, there, local, nearby
I, you, he, she, it, they, we
Come, go, left, right
This, that, these, those
Thus, so, yea

Some of these have characters that determine their contents with no interpretation necessary. ‘Today’ always names the day on which it is used. The rule for ‘that,’ however, is roughly that its content is whatever the speaker intends.

The general point here is that compositionality requires that the meaning of a complex expression not be determined ‘directly’ by context or by speaker intentions. However, a language can still be compositional if its simple expressions have their meanings (contents) determined by context or by speaker intentions.

Some philosophers have proposed compositional analyses of various apparently non-compositional phenomena that appeal to unwritten, unspoken context-sensitive expressions (“hidden indexicals”). For example, consider the sentence, ‘There is no beer.’ It might mean on different occasions: there is no beer on this menu; there is no beer at this party; there is no beer in this bottle, and so on. This could be because the sentence ‘There is no beer’ has its meaning determined by factors other than the meanings of its parts and the way they are combined. Alternatively, it could be because there is a hidden indexical ‘there’ that is really part of the sentence. The indexical, though present, is not written or spoken. Nevertheless, it contributes its context-sensitive content to the meaning of the sentence, thus accounting for the variability in the sentence’s truth-value from context to context. There is nothing theoretically problematic about such a hidden indexical account, but it should be emphasized that whether hidden indexicals exist in these cases is an empirical question, and the hypothesis that they do might turn out to be false.

c. Idioms

The term ‘idiom’ covers a wide range of expressions, including stale metaphors (she’s on the fence, he ran out of steam), common hyperboles (he drinks like a fish, there was no room to swing a cat), and even common phrases (she’s last but not least, there’s method to his madness). To the extent that we don’t think metaphor or hyperbole poses any trouble for the thesis that natural languages are compositional, these types of idioms appear equally benign.

However, there are some idioms whose meanings cannot be worked out by someone familiar only with their syntax and the meanings of their parts and whose meanings can’t be understood as implicatures. Consider idioms such as she let the cat out of the bag, or I think he’s pulling your leg. Understanding these complex expressions requires learning their meanings in advance, separate from the meanings of their parts. In fact, many idioms contain ‘words’ that do not otherwise occur in the language, or only occur with different meanings (that’s beyond the pale, this is an old wives’ tale).

It is not uncommon for philosophers to assert that compositionality admits of finitely many exceptions, and as there are only finitely many idioms in any language, compositionality is not violated. This is not strictly speaking true. The most general formulation of compositionality—the meaning of any complex expression depends on and only on its syntax and the meanings of its parts—admits of no exceptions, nor do many of its various precisifications—for example, reading ‘depends on’ as ‘is a function of,’ or ‘can be computed from.’

On the assumption that ‘kick the bucket’ has the same syntax, and simple parts with the same meanings, in both its idiomatic and its non-idiomatic meaning, its meaning is not a function of its syntax and the meanings of its simple parts, for functions have unique outputs. The substitution test fails: ‘kicked the pail’ does not have the same meaning as the idiomatic ‘kicked the bucket,’ despite having the same syntax and parts with the same meanings. In a more intuitive sense, the meaning of ‘kicked the bucket’ doesn’t depend on the meanings of ‘kick’ and ‘bucket’: those meanings, the act of kicking and the bucket, are neither here nor there with respect to the idiomatic meaning of ‘kick the bucket.’
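The appeal to functions having unique outputs can be made concrete. The following Python sketch assumes, as the argument does, that the idiomatic and literal readings share one syntax and one set of part meanings. The labels `KICK`, `THE`, and `BUCKET` and the bracketing are hypothetical placeholders, not a proposed analysis.

```python
def is_function(pairs):
    """True if no input is paired with two different outputs."""
    outputs = {}
    for inputs, meaning in pairs:
        # setdefault stores the first meaning seen for these inputs
        if outputs.setdefault(inputs, meaning) != meaning:
            return False
    return True


SYNTAX = "[V [Det N]]"
PART_MEANINGS = ("KICK", "THE", "BUCKET")  # 'pail' is assumed to mean BUCKET too

readings = [
    ((SYNTAX, PART_MEANINGS), "strike a bucket with the foot"),  # literal reading
    ((SYNTAX, PART_MEANINGS), "die"),                            # idiomatic reading
]

print(is_function(readings[:1]))  # the literal reading alone is functional
print(is_function(readings))      # adding the idiom breaks functionality
```

No lookup table keyed on syntax and part meanings can return both readings, which is why the idiom counts as a genuine exception on the functional formulation rather than a case the formulation can absorb.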

Here is what motivates the common refrain that “compositionality admits of finitely many exceptions.” Recall that the argument from novelty says that the best explanation for our ability to understand complex expressions whose meanings we have not been specifically taught is that those expressions have their meanings determined compositionally. The argument from novelty is irrelevant to complex expressions whose meanings we have been specifically taught. This includes the problematic idioms. No one understands “she let the cat out of the bag” or “he’s just pulling your leg” before they have been taught the specific meaning of those idioms. What the argument from novelty suggests is that new complex expressions must be composed only of expressions whose meanings we have learned specifically before—but these latter expressions can be simple like “dog” or complex like “let the cat out of the bag.”

While idioms may demonstrate that not all complex expressions have their meanings determined compositionally, it is important to note that compositionality may still serve its dialectical role. The argument from novelty shows that sentences we can understand without having learned their meaning specifically must have meanings that depend on parts whose meanings we have learned specifically. Thus we still have reason to doubt that the Real Meaning of “this is a green fish” is its inferential role, because (i) “this is a green fish” is the sort of sentence English speakers can understand without having learned its meaning specifically (unlike, for instance, “she let the cat out of the bag”) and (ii) as we’ve seen, the inferential role of “this is a green fish” does not depend on the inferential roles of “this is green” and “this is a fish.”

Nevertheless, idioms could still pose a threat to the claim that novel expressions are compositional, if it turns out there are non-compositional idioms we can understand, even though we have not been specifically taught their meanings. For example, consider the class of expressions that involve a VERB + the removal of relatively irremovable things to mean something like VERB-ed excessively: she cried her eyes out/ laughed her head off/ worked her butt off/ danced the night away… It might be that we can recognize novel instances of patterns like this, in ways that don’t involve calculating their meanings from the meanings of their parts. How exactly we process the meanings of sentences containing idioms is as of now an open question, and it might turn out that we speak a language that violates the principle of compositionality even for novel expressions.

d. Noun Modification

English nouns can be combined with other English nouns to form compound nouns—for example, ‘truck driver,’ ‘panda trainer,’ ‘demolition derby,’ and so forth. This process is productive: ‘You are reading the compositionality philosophy encyclopedia entry compounds section’ (the section on compounds from the entry in the encyclopedia of philosophy about compositionality).

One interesting aspect of noun compounds in English is that they do not specify the relation between the two nouns, and this relation differs from occasion to occasion. A house boat, for example, is a boat used as a house; but a boat house is not a house used as a boat, it’s a house for your boat to live in. A dog house is a house for a dog to live in, but a house dog is not a dog for a house to live in, nor is it a dog used as a house, it’s a dog that lives exclusively in the house. (Still more relations abound: brick house, house appraisal, house party…)

While we might treat many compounds simply as idioms, there are two general additional problems they pose: their productivity, as stated, and also the fact that nonce or novel compounds are regularly understood. Consider these examples:

Example 1: We are at a child’s birthday party, about to eat ice cream. There are several spoons, each of which has a different animal depicted on it. I tell you, “You can have the dog spoon.” You immediately recognize that I mean the spoon with a dog depiction on it.

Example 2: Similar birthday party scenario. This time there are only normal spoons. Unfortunately, there are only as many spoons as guests, and the dogs at the party have gotten ahold of one of them and slobbered all over it. I tell you, “Sorry, there’s no ice cream for you, unless you want the dog spoon.” You immediately recognize that I mean the spoon that the dogs have been playing with.

Example 3: You and I are shopping for a friend who likes to collect spoons. We find some very nice Chinese commemorative spoons from different years. With the background knowledge that our friend was born in the year of the dog, and that only one spoon is from the year of the dog, I say “Let’s get the dog spoon.” You immediately recognize that I mean the spoon that commemorates a year that is also a year of the dog in the Chinese zodiac.

In each of the examples, ‘dog’ means the same thing it always does, because ‘dog’ is not an indexical such as ‘I’ or ‘today’ and does not have different contents on different occasions. Similarly, in each of these examples, ‘spoon’ means the same thing it always does, because ‘spoon’ is not an indexical either. These two words exhaust the morphemes in the expression ‘dog spoon.’ Furthermore, in each of the examples, the syntax of ‘dog spoon’ is the same. And yet, in each of the examples, the meaning of ‘dog spoon’ is different. These facts, if they are facts, are straightforwardly incompatible with the claim that the meaning of ‘dog spoon’ depends on and only on its syntax and the meanings of its morphemes. These examples seem to show that the meaning of ‘dog spoon’ is context-sensitive because it directly depends on context, not because its parts are context-sensitive.

Similar remarks can be made for the English possessive ‘Heather’s horse.’ In separate contexts it can mean the horse that Heather owns, the horse that Heather has wagered money on, the horse that Heather is currently riding, the horse that shares a name with Heather, and so on. If ‘Heather,’ ‘horse,’ and the English possessive morpheme ‘-s’ don’t change their meanings from context to context, then it appears that the meaning of ‘Heather’s horse’ depends directly on context, and is thus not compositional.

Indeed, modification in English generally allows context-specific interpretations: ‘green leaf’ in different contexts could mean a leaf that is green on the outside, a leaf that is green on the inside, a leaf that is normally (but not now) green, a leaf depicted in the green volume of a color-coded set of volumes on leaves, and so on. Again, although ‘green leaf’ is context-sensitive, its parts, ‘green’ and ‘leaf,’ do not appear to be. This direct dependence of the meaning of a complex expression on context is a violation of compositionality.

There are various attempts at compositional solutions to the problem posed by compound nouns. There are two general strategies: first, one can deny that ‘dog spoon’ or ‘Heather’s horse’ or ‘green leaf’ differ in meaning from one occasion to the next. Second, one can accept that expressions such as these are context-sensitive, but argue that they do contain context-sensitive parts (for example, hidden indexicals) that explain the context-sensitivity.

As an example of the first strategy, some philosophers and linguists have argued that “dog spoon” means only “spoon somehow related to a dog or dogs.” More generally they say that any noun compound N₁ N₂ means “N₂ somehow related to a N₁ or N₁s.” In this way, noun compounds are assigned fixed, non-context-sensitive meanings that only depend on their syntax and the meanings of their parts. Such accounts have unintuitive consequences, to say the least: every time there is a toilet somehow related to paper, there is paper somehow related to a toilet. But it doesn’t obviously follow that whenever there is toilet paper, there are paper toilets. Furthermore, extending the strategy to possessives looks disastrous: If [N₁ [POS N₂]] means “N₂ somehow related to N₁,” then no matter which horse wins the race, Heather’s horse wins the race, because Heather is somehow related to all of them.
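The toilet paper consequence follows because “somehow related” is symmetric: if any relation links x to y, its converse links y to x. The following Python sketch illustrates this on a toy domain. The objects, kinds, and the single recorded relation are all invented for the example.

```python
def somehow_related(a, b, facts):
    """True if any recorded relation links a and b, in either direction."""
    return any({a, b} == {x, y} for (x, _relation, y) in facts)


def compound_extension(n1, n2, kinds, facts):
    """Things of kind n2 that are somehow related to something of kind n1."""
    return {
        thing
        for thing, kind in kinds.items()
        if kind == n2
        and any(
            somehow_related(thing, other, facts)
            for other, other_kind in kinds.items()
            if other_kind == n1
        )
    }


# Hypothetical domain: one roll of paper, two toilets, one recorded relation.
kinds = {"roll_1": "paper", "bowl_1": "toilet", "bowl_2": "toilet"}
facts = [("roll_1", "hangs_beside", "bowl_1")]

print(compound_extension("toilet", "paper", kinds, facts))  # 'toilet paper'
print(compound_extension("paper", "toilet", kinds, facts))  # 'paper toilet'
```

Whenever the first extension is non-empty, so is the second. On this analysis, the presence of toilet paper guarantees the presence of something that counts as a ‘paper toilet.’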

An example of the second strategy is to posit a “hidden indexical.” The idea is that ‘dog spoon’ means ‘spoon that bears relation R to dogs,’ where R is a relation-indexical that picks out different relations in different contexts, in the way ‘he’ picks out different males in different contexts. This strategy requires positing a large number of hidden indexicals: one wherever a noun is modified by a noun, a possessive, or an adjective. As previously discussed, there is nothing theoretically problematic with such solutions, but whether there are such indexicals in these cases is an empirical matter, and the hypothesis that there are may well be shown to be false.
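A hidden indexical account of this kind can be sketched in Python. Everything named here is an assumption made for illustration: the two relations, the dictionary encoding of spoons, and the idea that a context is just a record of which relation R picks out.

```python
# Hypothetical relations that a context might supply as the content of R.
RELATIONS = {
    "depicts": lambda thing, kind: kind in thing.get("depicts", ()),
    "was_chewed_by": lambda thing, kind: kind in thing.get("was_chewed_by", ()),
}


def compound(modifier, head, context):
    """Meaning of [modifier head] in a context, as a test on things.

    The context contributes only the content of the hidden indexical R;
    the rest is fixed by the syntax and the meanings of the two nouns.
    """
    relation = RELATIONS[context["R"]]
    return lambda thing: thing["kind"] == head and relation(thing, modifier)


animal_spoon = {"kind": "spoon", "depicts": ("dog",)}
slobbered_spoon = {"kind": "spoon", "was_chewed_by": ("dog",)}

first_party = {"R": "depicts"}         # Example 1: spoons with animal pictures
second_party = {"R": "was_chewed_by"}  # Example 2: the spoon the dogs got hold of

print(compound("dog", "spoon", first_party)(animal_spoon))      # True
print(compound("dog", "spoon", second_party)(animal_spoon))     # False
print(compound("dog", "spoon", second_party)(slobbered_spoon))  # True
```

On this picture ‘dog spoon’ varies in content across contexts, but only because one of its (unpronounced) parts does, so compositionality is preserved in the same way as for sentences containing ‘I.’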

5. Conclusion

The principle of compositionality plays a central role in the evaluation of theories of meaning. If the principle is true, or is true with only a constrained class of exceptions, many if not all current theories of meaning may turn out to be inadequate. This includes a number of popular non-cognitivist positions in metaethics. Despite its centrality, it is difficult to say precisely what the principle of compositionality requires, both because philosophers are divided on what exactly meanings are and because of the nebulousness of “dependence.” Furthermore, there are a number of productive, apparently non-compositional linguistic phenomena. If the principle of compositionality is untrue, we have to find some other way to explain how humans learn and understand productive languages.

6. References and Further Reading

a. General

There are several overviews of compositionality that have distinct focuses from this article. Readers are warned that much of the secondary literature on compositionality is very technical. Item [2] provides a formal framework for studying variants of compositionality and then surveys many such variants; it requires at least rudimentary knowledge of metalogic. Item [3] is a survey of issues concerning compositionality in Montague semantics; readers should have at least some familiarity with formal semantics in the Montagovian tradition.

[1] Dever, J. 2006. “Compositionality.” In E. Lepore & B. Smith (eds.), The Oxford Handbook of Philosophy of Language. Oxford University Press: pp. 633-666.
[2] Pagin, P. & Westerståhl, D. 2009. “Compositionality I: Definitions and Variants.” Philosophy Compass 5.3: pp. 250-264.
[3] Partee, B. 2004. “Chapter 7: Compositionality.” In her Compositionality in Formal Semantics: Selected Papers by Barbara Partee. John Wiley & Sons.

b. Frege’s Principle

The principle of compositionality is often called “Frege’s Principle,” because Frege is often considered the source or inspiration for the principle. However, it’s a matter of serious scholarly debate whether Frege did, in fact, hold the principle for either of the two kinds of meaning he recognized (Sinn and Bedeutung, or sense and reference). The curious reader is directed to [4] and [5]. Item [5] argues that while Frege held the principle of compositionality of reference (in the form of the substitution principle), there is no good evidence that he thought senses were likewise compositional. (This article also helpfully contains a wide variety of scholarly articulations of what compositionality is.) [4] argues that Frege did not even hold that the referent of a sentence was determined by its syntactic structure and the referents of its parts, because sentences’ referents vary, according to Frege, in ways that directly depend on context.

[4] Janssen, T. 2001. “Frege, Contextuality and Compositionality.” Journal of Logic, Language and Information Vol. 10: pp. 115-136.
[5] Pelletier, F. 2001. “Did Frege Believe Frege’s Principle?” Journal of Logic, Language and Information Vol. 10: pp. 87-114.

c. Dependence

Item [9] clarifies the relation between the substitution principle and the functional conception of compositionality. [8] is the locus classicus for the claim that compositionality involves a stronger notion of dependence, computability, than mere functional dependence. [11] is an elaboration and defense of the claim that dependence in the principle of compositionality is supervenience. [7] claims that compositionality is the principle that the meanings of complex expressions are “constructed from” the meanings of their parts and presents the principle of reverse compositionality (in the section “Compositionality and the Lexicon”), and [10] forcefully argues against that principle. [6] defends the empirical conception of dependence.

[6] Dowty, D. 2007. “Compositionality as an Empirical Problem.” In C. Barker & P. Jacobson (eds.), Direct Compositionality, Oxford University Press: pp. 23-101.
[7] Fodor, J. & Lepore, E. 2001. “Why Compositionality Won’t Go Away: Reflections on Horwich’s ‘Deflationary’ Theory.” Ratio 14.4: pp. 350-368.
[8] Grandy, R. 1990. “Understanding and the Principle of Compositionality.” Philosophical Perspectives 4: pp. 557-572.
[9] Hodges, W. 2001. “Formal Features of Compositionality.” Journal of Logic, Language and Information 10 (1): pp. 7-28.
[10] Johnson, K. 2006. “On the Nature of Reverse Compositionality.” Erkenntnis 64 (1): pp. 37-60.
[11] Szabó, Z. 2000. “Compositionality as Supervenience.” Linguistics and Philosophy, 23: pp. 475-505.

d. Novelty

Most papers on compositionality involve some discussion of the argument from novelty. [12] is the first explicit statement of the argument and the catalyst for contemporary discussions of it.

[12] Davidson, D. 2001. “Theories of meaning and learnable languages.” In his Inquiries into Truth and Interpretation. Clarendon Press: pp. 3-16.

e. Systematicity

There are two separate bodies of literature on systematicity. First, there are arguments for and against certain views of cognitive architecture involving a syntactic notion of systematicity. The opening volley is [15]. Item [13] contains a thorough discussion of how to understand this notion of systematicity, and [16] and [17] carefully consider whether natural language is systematic in this sense. The other, semantic sense of systematicity and the argument for compositionality based on it can be found in a number of Fodor’s works, including [14] pp. 106-107.

[13] Cummins, R. 1996. “Systematicity.” Journal of Philosophy 93: pp. 591-614.
[14] Fodor, J. 1994. “Concepts: A Potboiler.” Cognition 50: pp. 95-113.
[15] Fodor, J. & Pylyshyn, Z. 1988. “Connectionism and Cognitive Architecture.” Cognition 28: pp. 3-71.
[16] Johnson, K. 2004. “On the Systematicity of Language and Thought.” Journal of Philosophy 101: pp. 111-139.
[17] Pullum, G. & Scholz, B. 2007. “Systematicity and Natural Language Syntax.” Croatian Journal of Philosophy 21: pp. 375-402.

f. Compositionality vs. Theories of Meaning

Frege’s Puzzle originally occurs in [18]. There is a large literature on the puzzle; [23] is one detailed defense of the naïve theory. [19] is one of many examples of arguments against conceptual-role semantics using the principle of compositionality. Michael Dummett developed a sophisticated conceptual-role semantics; [22] is an excellent overview, as well as an argument that Dummett’s semantics too is non-compositional. The Frege-Geach problem appears in [20] and [25]. Hare casts the problem in terms of compositionality in [21]. [24] provides an accessible overview.

[18] Frege, G. 1997. “On Sinn and Bedeutung (1892).” In M. Beaney (ed.), The Frege Reader: pp. 151-171.
[19] Fodor, J. & Lepore E. 1993. “Why Compositionality (Probably) Isn’t Conceptual Role.” Philosophical Issues 3, Science and Knowledge: pp. 15-35.
[20] Geach, P. 1965. “Assertion.” Philosophical Review 74: pp. 449-465.
[21] Hare, R. 1970. “Meaning and Speech Acts.” Philosophical Review 79: pp. 3-24.
[22] Pagin, P. 2009. “Compositionality, Understanding, and Proofs.” Mind 118 (471): pp. 713-737.
[23] Salmon, N. 1986. Frege’s Puzzle. Cambridge: The MIT Press.
[24] Schroeder, M. 2008. “What Is the Frege-Geach Problem?” Philosophy Compass 3/4: pp. 703-720.
[25] Searle, J. 1962. “Meaning and Speech Acts.” Philosophical Review 71: pp. 423-432.

g. Triviality

Item [28] presents the triviality argument considered in this article. Items [7] and [27] are two different attempts at undermining Horwich’s conclusions. A distinct triviality argument is presented in [29]; [26] provides a response. Familiarity with formal logic is required for [29] and [26].

[26] Dever, J. 1999. “Compositionality as Methodology.” Linguistics and Philosophy 22: pp. 311-326.
[27] Heck, R. 2013. “Is Compositionality a Trivial Principle?” Frontiers of Philosophy in China 8 (1): pp. 140-155.
[28] Horwich, P. 1997. “The Composition of Meanings.” Philosophical Review 106: pp. 503-532.
[29] Zadrozny, W. 1994. “From Compositional to Systematic Semantics.” Linguistics and Philosophy 17.4: pp. 329-342.

h. Context-Sensitive Expressions

Item [31] is a classic and informs most contemporary work on context-sensitive expressions. [32] is an admirably clear treatment of what the principle of compositionality does and does not say about context-sensitivity. [33] began a debate about “unarticulated constituents”: aspects of meaning that are contextually supplied, but not compositionally derived. [30], [34], and [35] are three different contemporary perspectives in the debate.

[30] Carston, R. 2000. “Explicature and Semantics.” UCL Working Papers in Linguistics 12.1.
[31] Kaplan, D. 1989. “Demonstratives.” In J. Almog, J. Perry, & H. Wettstein (eds.) Themes from Kaplan: pp. 481–563.
[32] Lasersohn, P. 2012. “Contextualism and Compositionality.” Linguistics and Philosophy, Vol. 35.2: pp. 171-189.
[33] Perry, J. 1986. “Thought without Representation.” Proceedings of the Aristotelian Society, Supplementary Volumes: pp. 137-166.
[34] Recanati, F. 2010. Truth Conditional Pragmatics. Oxford University Press.
[35] Stanley, J. 2002. “Making It Articulated.” Mind & Language 17: pp. 149-168.

i. Idioms

Readers interested in idioms should begin with [36] and follow its bibliography for more references.

[36] Nunberg, G., Sag, I., & Wasow, T. 1994. “Idioms.” Language, Vol. 70, No. 3: pp. 491-538.

j. Noun Modification

Noun compounds, possessives, and modification of nouns with color adjectives provide instructive case studies regarding how philosophers, linguists, and psychologists confront apparently non-compositional phenomena. [37] is a classic, accessible source for observation, experiment, and linguistic analysis of noun compounds. [41] defends the thesis that a compound [N₁ N₂] means “N₂ somehow related to a N₁ or N₁s,” and [44] defends a hidden indexical solution. [40] is a good overview of the issues regarding the semantic treatment of possessives. A number of papers by Travis, including [43], have articulated the problem color adjectives present for the compositionality of truth-conditions. [42] presents a hidden indexical solution, and [38] attempts to use more standard resources to solve the problem. The psychological literature on noun modification typically eschews compositional treatments and goes under the heading “conceptual combination.” [39] is a review of the major psychological theories of processing modified nouns.

[37] Downing, P. 1977. “On the Creation and Use of English Compound Nouns.” Language 53.4: pp. 810-842.
[38] Kennedy, C. & McNally, L. 2010. “Color, Context, and Compositionality.” Synthese 174.1: pp. 79-98.
[39] Murphy, G. 2002. “Conceptual Combination.” In his The Big Book of Concepts. The MIT Press: pp. 443-75.
[40] Partee, B. “Chapter 15: Some Puzzles of Predicate Possessives.” In her Compositionality in Formal Semantics: Selected Papers by Barbara Partee. John Wiley & Sons.
[41] Sainsbury, R. 2001. “Two ways to smoke a cigarette.” Ratio 14: pp. 386-406.
[42] Szabó, Z. 2001. “Adjectives in Context.” In I. Kenesei & R Harnish (eds.) Perspectives on Semantics, Pragmatics and Discourse: A Festschrift for Ferenc Kiefer. Amsterdam: John Benjamins: pp. 119-146.
[43] Travis, C. 1997. “Pragmatics.” In B. Hale & C. Wright (eds.) A Companion to the Philosophy of Language. Blackwell: pp. 87-107.
[44] Weiskopf, D. 2007. “Compound Nominals, Context, and Compositionality.” Synthese, 156: pp. 161-204.

k. Additional Problems

There are a number of additional phenomena that have been seen as challenges to the principle of compositionality. Quotation as a problem for the principle of compositionality goes back at least to [18]. [45] presents a unique attempt to give a compositional treatment of quotation. [46] and [48] include treatments of so-called “donkey sentences.” The representations assigned by Kamp’s Discourse Representation Theory ([48] and other work) are unabashedly non-compositional. [47] and [49] concern a challenge for compositionality arising from the interaction of ‘unless’ with quantifiers.

[45] Davidson, D. 1968. “On Saying That.” Synthese 19: pp. 130-146.
[46] Heim, I. 1982. The Semantics of Definite and Indefinite Noun Phrases. Ph.D. dissertation. Department of Linguistics. University of Massachusetts, Amherst.
[47] Higginbotham, J. 1986. “Linguistic Theory and Davidson’s Program in Semantics.” In E. Lepore (ed.) The Philosophy of Donald Davidson: Perspectives on Truth and Interpretation. Oxford: Blackwell.
[48] Kamp, H. 1981. “A Theory of Truth and Semantic Representation”. In: J. Groenendijk, T. Janssen & M. Stokhof (eds.) Formal Methods in the Study of Language. Mathematical Centre Tracts 135, Amsterdam: pp. 277-322.
[49] Pelletier, F. “On an Argument against Semantic Compositionality.” In D. Prawitz & D. Westerståhl (eds.) Logic and Philosophy of Science in Uppsala. Kluwer: pp. 599-610.

l. Additional References

[50] Bittner, M. 1995. “Quantification in Eskimo: A Challenge for Compositional Semantics.” In E. Bach, E. Jelinek, A. Kratzer, B. Partee (eds.), Quantification in Natural Language. Kluwer: pp. 59–80.

Author Information

Michael Johnson
Email: michael.dracula.johnson@gmail.com
Hong Kong University
Hong Kong

Chinese Philosophy: Overview of Topics

If Chinese philosophy may be said to have begun around 2000 B.C.E., then it represents the longest continuous heritage of philosophical reflection. Mentioning every philosopher or significant thinker is not possible. This article is therefore highly selective, choosing philosophers according to two basic principles: (1) those who are the most representative of the key contributions of China to philosophical topics worldwide, and (2) those who made substantial redirections on a fundamental question of philosophy. Excluded are those who followed the grammar and approach of earlier thinkers, and who engaged more specifically in what might be called internecine debates and refinements.

The positions of the thinkers covered are grouped under the topics of ontology, epistemology, moral theory, and political philosophy. Fundamental questions belonging to these categories show up in Chinese philosophy, just as they do in Western thought. There are questions Chinese thinkers do not ask or do not approach in the same way as Western philosophers, so gaining an appreciation for why Chinese philosophy has sometimes followed a different path from that taken in the West is itself instructive. This overview is designed to pique the interest of readers, encouraging them to pursue the ways in which Chinese thinkers have made significant contributions to topics of interest in world philosophy.

Ontology: Fundamental Questions on the Nature and Composition of Reality
Epistemology: Fundamental Questions on the Nature and Scope of Knowledge
Moral Theory: Fundamental Questions on Morality
Political Philosophy: Fundamental Questions on Society and Government
Epilogue
References and Further Reading

1. Ontology: Fundamental Questions on the Nature and Composition of Reality

Western philosophy often takes the theory of reality (ontology) as equivalent to metaphysics, but this term does not fit Chinese philosophy, as it implies there is something beyond nature that creates and guides reality from the outside. While Chinese philosophical thought has a wide variety of ontologies, it has not stressed metaphysics in the traditional Western sense. Some ontological questions Chinese philosophers have considered are these: What is reality composed of? Is reality a single type of thing (monism), two types of things (dualism, such as minds and bodies; matter and spirit), or many kinds of things (pluralism)? Is reality composed only of transient things in constant change or are there eternal substances that form its content? Is reality actually as it appears to us, or is it something different than what we think it is? Is reality teleological; that is, is it “purposing” or going toward an end? Is the process of reality guided by a mind or intelligence to occur as it does, or does it follow some internal pattern of its own nature, or do humans attach meaning or purpose to a reality that is devoid of any inherent meaning?

a. Formation of the Early Chinese Worldview

In the period from the beginning of the Zhou dynasty (c. 1045 B.C.E.) to the beginning of the Han dynasty (206 B.C.E.), a number of classical Chinese texts were compiled. These are known now as the “Five Classics” (wujing), and they became enshrined as texts in the educational system of China for hundreds of years. In their received form, all of them have composite elements and some may well reflect the concerns and contexts of later (more in the Han dynasty period) rather than earlier (more in the Zhou dynasty) periods. Despite the uncertain dating of many passages and themes, these texts contain a substantial amount of material that is traceable to the pre-Qin (pre-221 B.C.E.) period, even reaching back to Confucius’s era (551-479 B.C.E.) or before.

The ontology of early Chinese thought comes down to us through a number of philosophical texts which are not traceable to any single author. Included among these are: the “Great Commentary” to the Classic of Changes (Yijing), the Chronicles of Zuo (Zuozhuan), and the “Great Plan” (Hong Fan) section of the Classic of History (Shujing).

i. The Classic of Changes (Yijing) and Its Place in Chinese Philosophy

The Classic of Changes (Yijing) is a complete edited work in two parts. One part is a manual of divination known simply as the Changes (Yi), or more correctly, as the Zhouyi. It is a handbook traceable to the period and practices of the Western Zhou dynasty as indicated, among other features, by its use of language expressions found on the bronzes of that period (c. 1046-771 B.C.E.). The other part of the Classic of Changes is a set of seven commentaries. Three of the commentaries are composed of two sections each. Taken as a whole, the commentary of this second part is known as “The Ten Wings” (Shiyi).

One of the commentaries is known as the Great Commentary (Dazhuan). It is arguably the most important text to study for an understanding of early Chinese ontology. The Classic of Changes as a whole is much less valuable for this purpose.

Regrettably, the date of composition of the Great Commentary cannot be determined. However, a version of it was discovered as a silk manuscript among the archaeological finds at the Mawangdui tomb site in Changsha in 1973. Therefore, it must be older than 168 B.C.E. when the tomb was closed. The work makes use of the fundamental philosophical vocabulary of Chinese ontology that continues to be used by Chinese thinkers up to the early modern period. It speaks of Heaven (tian) and earth (di) collectively (tiandi) as a way of talking about “reality”. As for the process of reality’s change, it employs the term dao as a nominative and portrays it as operating according to patterns (tian wen) or Principle(s) (li). In this commentary, the substance of reality (qi) is capable of transforming into a myriad of experienced objects, evidencing properties of what might be called in the West “matter” or in other forms “spirit.” Qi is moved by pushes and pulls of its internal opposing forces, yin and yang (Great Commentary Part I, 1, 4). Although reality’s changes are not arbitrary, neither are they guided by a mind or divine intellect. The Great Commentary associates the patterns (li) that give order to reality with the hexagrams found in the divination manual (the Zhouyi). The general philosophical term for the process of reality is “correlative ontology”. Various correlations are possible; for example, yin and yang may be mutually supportive, or one may be transforming the other, balancing it, compensating for it, enhancing it, or furthering something new in relation to the other.

In Western philosophy, the characteristic approach to ontology is to think of things that compose reality as “natural kinds,” each of which has a different essence that makes it what it is; for example, the essence of a chair, a cat, a tree, and so forth. This defining essence is typically called “the nature” of the object. In early Chinese ontology, change and process are more fundamental than continuity and endurance, even if there is sufficient constancy to speak of objects through time. The characteristic configuration of qi that something is actualizing (dao-ing) sets it apart from other things. This distinctive correlation of yin and yang does the philosophical work of the Western concept of essence. It enables identification of kinds and categories of things, without recourse to an ontology in which there is a pluralism of essentially different sorts of substances.

Chinese philosophers inheriting the ontology of the Yijing and Great Commentary still use the concept of the “nature” (xing) of something, but “nature” does not refer to some underlying essence or immaterial substance that makes something what it is in distinction from other things. “Nature” is a way of talking about the manner of qi correlation that actualizes a thing as it is and sets it apart from the correlations of other things.

ii. The Chronicles of Zuo (Zuozhuan, c. 389 B.C.E.)

The Chronicles of Zuo is a record of occurrences of the Spring and Autumn Period (771-468 B.C.E.) that traditionally has been ascribed to Zuo Qiuming, a court writer who lived in the State of Lu during the time of Confucius. The text is arranged as comments on the reigns of various marquises and dukes, and it was likely completed no later than 389 B.C.E.

Remarking on the 7^th year of the reign of Duke Wen (626-609 B.C.E.) the Chronicles of Zuo says: “Water, fire, metal, wood, earth, and grains are called the six natural resources (liu fu, or “six treasures”)”. The character fu is used to denote them. This list of six contains the five phasal elements (wuxing) of wood (mu), fire (huo), earth (tu), metal (jin), and water (shui) that we see in later ontological works, with the addition of grains. The wuxing correlative ontology refers to a conceptual scheme that is found in traditional Chinese thought. Its elements are regarded as dynamic, interdependent modes or aspects of the universe’s ongoing existence and development. All objects of reality are some combination and interdependent operation of these five. In comments on the 27^th year of the reign of Duke Xiang (572-542 B.C.E.) the text says: “Heaven has given birth to the five materials (wu cai) which supply humankind’s requirements, and the people use them all. Not one of them can be dispensed with.”

iii. The “Great Plan” in the Classic of History (Shujing)

In the “Great Plan” chapter of the Classic of History the compilers are interested in explaining how society should follow the patterns (li) of Heaven and earth. To do so, they provide the reader with information about these patterns, which offers substantial content about the ontology of the period. For example, in speaking of the nine divisions of the “Great Plan” by which Heaven orders reality, the text refers to the five phasal elements that are the building blocks for all real objects (Classic of History, “Great Plan” 2.2). The chapter does not spell out how the interdependencies of these five phases work; it only says they exist. It is made clear, though, that if humans do not behave in the proper manner, they can disrupt the harmonious operation of these phases: illness and weakness will arise in the body, and disorder will show up in nature and the human world of history.

b. Mozi (fl. 470-391 B.C.E.)

While a study of Mozi’s (Mo Di or Master Mo) moral thought is paramount to understanding Chinese philosophy, his views on ontology, especially as they are set out in Books 8-37 and 46-49 of the Mozi, are sometimes overlooked. An understanding of Mozi’s views on reality begins with what he has to say about Heaven (tian). In classical Chinese, the word tian has many uses. When used as “Heaven and earth,” it is typically a reference to reality or all that is. Tian used alone is a nominative for the sky or a more or less numinous person.

Not surprisingly, then, the Mozi text often describes Heaven as though it is an agent that acts with intentions (yi, zhi) and desires (yu) (for example, in chapters 26-28). Heaven is praised as impartial, generous, wise, and just. It cares for humans and benefits the worthy by providing resources and blessings. Heaven has a dao that orders all things, including its relations with humanity. To use a comparable philosophical concept from the West for Mozi would be to say that Heaven is providential. Moreover, the source of a universal morality that overcomes and corrects human ethical conceptions is Heaven’s will mediated through the ruler.

Holding such a view is one of the reasons why Mozi is committed to a rejection of the philosophical position that the happenings in the course of reality’s process are predestined or fated (ming). Mozi’s arguments on this subject are gathered in the “Against Fate” chapters (35-37) of his text. A principal argument used by Mozi against the position that reality is fated is a pragmatic one. He holds that accepting such a position would mean that one’s status, health, wealth, success, and longevity are already determined and not consequences of one’s effort or choices in life. Taking this view would lead to disaster (37.10). In fact, Mozi says the concept of fate should be regarded as a creation of evil kings and peasant farmers. His point is that some kings used this philosophical idea as a means to justify their positions of power and wealth, while the peasants used it to explain why their reduced situation in life was not a result of living wrongly, or failing to better themselves; that is, it was fated that they be poor. This explains in part why Mozi considered the ontological concept of ming (fate) to be one a philosopher must reject.

c. Lao-Zhuang Daoist Ontology (c. 350-139 B.C.E.)

To speak collectively of a “Lao-Zhuang” tradition is to identify a set of philosophical sentiments and positions in common between the two classical works of emergent Daoism in Chinese intellectual history: the Daodejing (DDJ) and the Zhuangzi (ZZ). Both the Daodejing associated with Laozi and the Zhuangzi ascribed to Zhuang Zhou (369-289 B.C.E.) are composite works not written by a single author. Throughout the classical period, there were many lineages of teachers and disciples, as well as multiple oral and written versions of transmitted materials that came together to form these texts. There was no unified, coherent school called Daoism in the classical period, but the term Lao-Zhuang can be used to capture the family resemblances between lineages and their transmitted teachings.

We have already noticed in our survey of the earliest Chinese ontologies that reality (that is, “Heaven and earth”) is a constant process, but the changes are not haphazard. The Chinese term used to capture the order reality exhibits is dao, which literally means the ‘way’ or ‘path’ that the changing process of reality displays. In this process, there are patterns and principles that are evident to one who reflects on the dao. The dao of qi (the energy which composes all things) gives rise to itself and to forces that move it. It is self-moving, according to the dynamic energies of yin and yang.

The term dao is one of the most important concepts in the Daodejing. Sometimes it is used as a noun (“the Dao”) and other times as a verb (“dao-ing”). According to the Daodejing, the dao has a power in itself from which all things have come (DDJ 42). There is a confidence expressed in the text that the process of the dao of reality is at a minimum benign (DDJ 37 says dao leaves nothing undone). In fact, it is untangling knots that humans create, as well as blunting the sharp edges constructed by those who are resisting or moving contrary to dao. There is a very close association of dao with Heaven (tian) that benefits and does not harm (DDJ 73, 77, 81). When we look closely at the Daodejing’s remarks about Heaven they make it clear that a critical move is made in Chinese ontology by thinkers in this tradition. Heaven’s dao is life-furthering and full of benefit, but it is without deliberation or plan. Unlike the Mozi’s Heaven, dao has no mind: it is not planning or working by a design toward a goal it is trying to reach. It acts spontaneously, yet it does not leave loose ends or cause problems, disorder, or confusion.

In the sections of the Zhuangzi anthology that come from the master teacher, Zhuang Zhou, these matters are expressed in a very literary way. For dao the text often uses “The Great Clod” by which all things come into being (ZZ Ch. 2). But when using dao, the Zhuangzi says it lacks form but is its own root, and it gave birth to Heaven and earth and all things (ZZ Ch. 6).

The point being made in both the Daodejing and Zhuangzi is that the dao is beyond language and the cognitive categories of space and time. It is not in any space, nor has it any temporal description. As such, dao functions as what philosophers call a “limiting concept”. Asking when dao began serves no purpose because time does not apply to it; neither does speculating about where it exists because it is not in any particular place.

The Zhuangzi does not make any specific reference to the five phasal elements ontology used in the “Great Plan”, probably because it was in development at the same time that the Zhuangzi text was being formed. It makes clear, however, that all things are changing and being transformed, and that people can have some involvement in their own transformations (ZZ Ch. 6, 7).

d. Correlative Cosmologies in the Han Period: Yinyang and Wuxing Heuristics

According to Sima Tan (d. 110 B.C.E.), during the Spring and Autumn and Warring States (403-221 B.C.E.) periods a school existed that bore the name yinyang. He lists this yinyang school alongside others such as the Confucian, Mohist, Legalist, and Daoist. According to him, this school focused on divination and explored the patterns of Heaven and earth. This school almost certainly had its antecedents in the Zhouyi and was likely a theoretical and heuristic extension of many of the practices associated with that text.

By the Han dynasty (202 B.C.E.-220 C.E.), yinyang thought was associated with the standardization of wuxing (the five phasal elements) correlative cosmology associated with the work of Zou Yan (c. 305-240 B.C.E.). The synthesis of Confucianism, yinyang, and wuxing explanatory philosophies is evident in the writings of the scholar Dong Zhongshu (179-104 B.C.E.) and exhibited in his volume, Luxuriant Dew of the Spring and Autumn Annals (Chunqiu fanlu). The Masters of Huainan (Huainanzi) is also a primary representative text for correlative cosmology. Large sections of Chapters 2, 3, 7, and 20 depend heavily on this ontology for the cogency of the work’s argument about Heaven’s relation to human activity. Masters of Huainan, however, tends to blend Daoist sensibilities (especially Yellow-Emperor Daoist ideas) with yinyang and wuxing more prominently than did Dong Zhongshu’s work.

e. Selected Buddhist Ontologies

Scholars have debated two interpretations of how Buddhist missionaries first reached China: first, through maritime landings in its southern regions, spreading up the Chang Jiang (Yangtze River) and the Huai waterway into the area of present day Jiangsu province under Prince Ying of Chu (c. 65 C.E.); and second, by moving overland along the northern Silk Road through the areas controlled by the Yuezhi central Asian peoples in what is now Xinjiang province and western Gansu province. The latter interpretation continues to have the greatest preponderance of evidence in its favor, along with long-standing traditions that the White Horse Temple in the Han capital of Luoyang (present day Henan province) was the first temple in China (c. 68 C.E.). However, it seems clear that Buddhism came into China by both routes.

China did not escape the diversity of Buddhist Madhyamika philosophical schools; many scholars have argued convincingly that Chinese thinkers did not realize for decades that the Buddhist texts coming from India represented different schools of thought and so they tried unsuccessfully to harmonize them into a single philosophical system. Gradually, Chinese thinkers created some distinctively Chinese approaches to and versions of the Buddhist schools and even began some schools that were indigenous to China.

i. Tiantai Buddhism (Zhiyi, 538-597 C.E.)

Unlike earlier schools of Chinese Buddhism, the Tiantai School was largely of Chinese origin. Tiantai flourished under its fourth patriarch, Zhiyi, who asserted that the Lotus Sutra (Fahua jing) contained the supreme teaching of Buddhism. The school derives its name from the Tiantai mountain that served as its base. The most distinctive ontological claim of Tiantai is that there is only one reality that is both the phenomenal existence of everyday experience and nirvana itself. This is a significant divergence from many early Buddhist teachings in India that drew a sharp demarcation between the phenomenal world and the world of nirvana. In Tiantai, not only is there one reality, but that reality is also ultimately empty. The reason all things are empty is that literally every object and real thing (that is, every dharma) exists as it is through an indefinite number of interdependent causes. Nothing has its own nature or essence that underlies or exists apart from the interplay of all these causes. Accordingly, all things have only tentative existence, and they are impermanent.

Humans experience phenomenal reality as various forms of pain and suffering, happiness and contentment, and may also realize overwhelming enlightenment and peace. In fact, Tiantai writings describe ten ways of existing in reality, but these do not reflect any interest in the kinds of extrapolations offered in the other Chinese ontologies, such as dao, yin and yang, or the elaborate five phasal elements system.

The Ten Ways of Existing in Reality According to Tiantai Buddhism

1. Hell Beings
2. Hungry Ghosts
3. Beasts (non-human animals)
4. Asuras (demons)
5. Human Beings
6. Gods or celestial creatures
7. Voice-hearers (Sravakas)
8. Self-enlightened Ones (Pratyekabuddhas)
9. Bodhisattvas
10. Buddhas

In Tiantai ontology, the reality that Hell Beings inhabit is the same reality in which the Buddhas live. There is no supernatural boundary between these ways of existing; nor are there opposing spiritual realms such as Heaven and Hell. Living and working next to us may be one who is a Hell Being, or a Bodhisattva, or even a Buddha. The goal is not to depart this world and go into some other transcendent reality. It is to exist as a Buddha in this world. There is no other reality except this one; reality is one.

In Tiantai, every human has the capacity to live in reality as a Buddha. Living as such does not make one eternal; every existing thing will be extinct in the form in which it now exists. This is a reflection of the empty nature of reality, the only reality that there is. At the same time, Tiantai does not deny physical reality; it is no Idealism. Rather, it is a form of ontological Realism, confident that manifold concrete yet fundamentally empty things exist and that they may realize sublimity in this life.

ii. Consciousness-only Buddhism (Weishi Zong)

The version of Chinese Buddhism known as Consciousness-only was called Yogacara in India. The monk Xuanzang (c. 602–664), born Chen Hui, was principally responsible for its popularization in China through his translations of texts he brought from India. His travels there are recorded in detail in the classic Chinese text Great Tang Records on the Western Regions, which in turn provided the inspiration for the imaginative spiritual journey novel Journey to the West, written by Wu Cheng’en during the Ming dynasty (1368-1644), around nine centuries after Xuanzang’s death.

The central ontological tenet of Consciousness-only Buddhism is that nothing exists except consciousness. Reality is the flow of experiences, and awareness of ideas is called perception. Perceptions are not caused by things external to humans such as concrete or material objects that continue whether humans are conscious of them or not. In ontological language, this is called Idealism, which contrasts with the Realism of Tiantai. In their original context in India, the Consciousness-only teachings directly contradicted the prevailing Indian physics of reality, according to which all things (dharmas) are constructed from the atoms of earth, water, fire and air. They also stood in radical contrast to Chinese thought about qi and the five phases.

In Consciousness-only teaching, when a person is born, thereby becoming conscious, individual experience is not funded by encounters with objects in an external world but by something Xuanzang called the “storehouse consciousness”. Every deed that has ever been done and every idea that anyone has had is contained in this consciousness. No dharma (experienced idea) exists by itself, and any alteration in the way other ideas cause it to exist would be a different experience entirely. This is what is meant by the concept of “dependent co-arising” in Consciousness-only philosophy.

Still, not all consciousness is of the same level of development; some forms are higher than others. As the levels of consciousness advance, they “perfume” the highest level of consciousness into being. “Perfuming” in this philosophy is a unique ontological approach to causality, quite different from Aristotle’s discussions of cause and John Stuart Mill’s remarks on the determination of cause.

f. The Neo-Confucian Synthesis of Zhu Xi (1130-1200)

Beginning in the early 11^th century, a group of interdependent philosophers began to reconstruct Chinese philosophy by using a new grammar. They sought to merge Confucian thought with Daoist and Buddhist concepts. While they surely thought of themselves as Confucian, and valorized Confucius and Mencius (c. 372-289 B.C.E.) in their writings, it is clear that they were doing something novel with their appropriation of classical Confucian ideas. Accordingly, they are grouped together as “Neo-Confucians”. This family of thought included philosophers such as Zhang Zai (1020-1077), Cheng Hao (1032-1085), and Cheng Yi (1033-1107).

Without doubt, Zhu Xi is the most influential of these thinkers. His philosophy set the parameters of philosophical conversation on ontology throughout East Asia for over 400 years. Western philosophers of the same stature would include Aristotle in the Classical period, Thomas Aquinas in the Medieval period, and Immanuel Kant in the Enlightenment period. Zhu Xi’s systematization of the Confucian Way (dao) also became a coherent program of education for centuries in China, Korea, and Japan.

Zhu Xi’s extensive philosophical work rests on the foundation of his theory of reality. The place to begin understanding his ontology is the following statement by Zhu: “Everything that has shape and form is “concrete existence” qi. That which constitutes the Principle(s) (li) of “concrete existence” is the Way (dao)” (Collected Writings of Chu Hsi 36.14).

Several philosophical questions arise in Zhu Xi’s ontology. Did he think of Principle(s) as singular or plural? What should be included in Principle(s) when he uses this as an ontological concept? Does Principle(s) refer to something like the logical scaffolding of reality (that is, its design, order, logical structure, or pattern)? Does Zhu Xi use Principle(s) to mean something like the natural laws discoverable by chemistry, physics, and the like? Are Principle(s) in Zhu’s ontology similar to what Kant called the “categories of the mind” (causality, space, time, and so forth)? Does Principle(s) sometimes mean “moral principles or norms” that are universally binding and true for all persons? Zhu Xi sometimes uses Principle(s) in one of these senses and sometimes in another. It is not possible to reduce his remarks on Principle(s) to any one of these exclusively. Likewise, the term is sometimes used in the singular and sometimes in the plural in his writings.

For Zhu Xi, the Principle(s) of reality reside in the Supreme Ultimate. But this is not a thing or a being. Rather, before shapes and things began to exist, the Supreme Ultimate from which they came had the principles of shape and order, but was not itself any shape or form. Neither is it a “blank” (wu). It cannot be said to exist (yu) as one thing alongside others. It existed before Heaven and earth. Although the noted scholar Feng Youlan takes Zhu Xi’s discussion of Principle(s) to be a version of what Plato called the Forms (see his A Short History of Chinese Philosophy), such a reading is debatable. It is not as though a brick is an expression of the Platonic Form of a brick. Rather, a brick is the result of a specific five-phase configuration ‘bricking’ (as a verb of action) according to Principle(s) that are universally shared by all things. The Supreme Ultimate is a concept used for talking collectively about the Principle(s) governing the five phases and yin and yang. On this reading, Principle(s) enable concrete configurations of qi to yield the myriad things that furnish reality.

Zhu Xi’s ontology may be considered a form of Naturalism, rather than Theism. The Supreme Ultimate is not God in the Western sense or Plato’s Form of the Good. However, neither is it reducible to or the product of the other cosmological operators in Zhu’s thought such as qi, yin, yang, or the five phases.

g. Wang Yangming (1472-1529)

The principal sources for Wang Yangming’s ideas are his Instructions for Practical Living (Chuan Xilu, 1518) and “Inquiry on the Great Learning” (1527). The latter work offers a succinct summary of the main themes he developed throughout his life.

Wang is often understood to be an ontological Idealist. But he makes it clear that he is not an Idealist in a famous story where he points out to a friend the flowering trees on a cliff. The friend assumes that Wang’s position is a form of Idealism. He then challenges Wang by claiming the flowers are independent from his mind. Wang’s reply makes his ontology clear. He says that before the friend looked at the flowering trees, they were simply there in their vacancy, but when the friend experiences them, he thinks of them as a tree, a cliff, and flowers. Thus, as the experienced “world” they are not at all independent of his friend’s mind. They cannot be “flowers on a cliff” without the mind.

Why is this? For Wang, the reason is very clear. It is because human minds are inherently patterning. Known as the human mind’s Principle (li), this patterning is what makes things as they are into a universe or reality. Otherwise, there are only concrete things (qi) moving around; there is really no “world”. So, Wang is not denying the existence of concrete things, as Idealism does, but he is insisting that these things are not without the patterning that the mind brings to experience.

When human minds do this patterning, it is not always a conscious or deliberative process. Likewise, individuals do not “know” the Principles by which they engage in the process. Rather, in the most truthful experiences, human minds are one with Heaven and earth, and the Principles are applied directly by “pure intelligence” (liangzhi), not through the mediation of data from the five senses, or by discursive reason, or the authority of any book or philosophical teacher.

There is a fundamental difference, though, between Wang’s position and that of Zhu Xi. Wang does not set Principle(s) in a transcendent sense apart from concrete things. In fact, he gives them no existence apart from the human mind. If there were no human mind, there would be no “world.”

h. Shifting Paradigms in Chinese Ontology

i. Dai Zhen (1723-1777)

Dai Zhen’s two most prominent philosophical works are entitled On the Good (Yuanshan) and An Evidential Study of the Meaning and Terms of the Mencius (Mengzi ziyi shuzheng). Some interpreters hold that Dai Zhen was responsible for a major paradigm shift in Chinese thinking on ontology. He completely removed the transcendent aspect from Principle(s) (li), and this is certainly a shift from Zhu’s understanding. Furthermore, Dai did not think that Principle(s) were independent of concrete things as Zhu did, but neither did he think they were an activity of the human mind as Wang believed. Instead, he conceived of Principle(s) as the internal order (tiao) or pattern (wen) of things-in-themselves.

To use Western philosophical terms, Dai’s thinking is a form of teleological naturalism. Purpose, pattern, and design are not imposed on reality by human beings, but neither do they derive from a transcendent realm that is wholly other than the natural process itself. Instead, they are a part of the very nature of the stuff of reality itself.

Some interpreters of Dai characterize his position by means of a rather distinctive Chinese example. A method used to determine the authenticity of a piece of jade in China is to hold it up to the light and observe whether veins can be seen in its translucence. If so, the jade is authentic. If not, it is an imitation and a fake. Accordingly, Dai may be interpreted to be saying that concrete objects have such analogous striations and these are the Principle(s) that give order to reality.

ii. Hu Shi (1891-1962)

Hu Shi was a key figure in the New Culture Movement that introduced ideas from the West to China. This movement developed the slogans “Mr. Science” and “Mr. Democracy” to describe Western learning (Xi xue). Hu specifically acknowledged the influence of Thomas Huxley and John Dewey on his thought, and he was a contemporary of some of the most prominent Western philosophers, including Ludwig Wittgenstein and Martin Heidegger. He has been called the central figure in 20^th century Chinese academic thought.

Hu studied in a Western-style system in Shanghai, being particularly impressed by the Darwinian theory of evolution. Later, he studied in America at Cornell and Columbia University, where John Dewey became his dissertation supervisor. While still a young student in Shanghai, he summarized the changes in his conception of life in the universe from the Chinese ontology with which he was raised. He published this summary in 1923 under the title “New Credo”. It includes the following points:

On the basis of knowledge of astronomy and physics, people should recognize that the world of space is infinitely large.
On the basis of geological and paleontological knowledge, people should recognize that the universe extends over infinite time.
On the basis of all verifiable scientific knowledge, people should recognize that the universe and everything in it follow natural laws of movement and change. Here “natural” has the Chinese sense of “being so of itself”, and there is no need for the concept of a supernatural Ruler or Creator.
On the basis of the biological sciences, people should recognize the terrific wastefulness and brutality in the struggle for existence in the biological world and consequently the untenability of the hypothesis of a benevolent Ruler.
On the basis of the biological, physiological, and psychological sciences, people should recognize that man is only one species in the animal kingdom that differs from the other species only in degree, but not in kind.
On the basis of the knowledge derived from anthropology, sociology, and the biological sciences, people should understand the history and causes of the evolution of living organisms and of human society.
On the basis of the biological and psychological sciences, people should recognize that all psychological phenomena could be explained through the law of causality.
On the basis of biological and historical knowledge, people should recognize that morality and religion are subject to change and that the causes of such change can be scientifically studied.
On the basis of newer knowledge of physics and chemistry, people should recognize that matter is full of motion and not static.
On the basis of biological, sociological, and historical knowledge, people should recognize that the individual self is subject to death and decay, but that the sum total of individual achievement, for better or for worse, lives on in the immortality of the Larger Self. They should also recognize that to live for the sake of the species and posterity is religion of the highest kind, and that those religions that seek a future life either in Heaven or in the Pure Land are selfish religions.

Hu Shi calls this credo “The Naturalistic Conception of Life and the Universe”. This work, which he saw as a turn from Chinese philosophy leading up to the 20^th century, illustrates his commitment to the experimental sciences. He continued to embrace this credo throughout his life.

2. Epistemology: Fundamental Questions on the Nature and Scope of Knowledge

Some epistemological questions are these: What is it “to know”? Can we know something to be true, or do we only believe things to be true (skepticism)? Are all knowledge claims of the same sort? Are they justified in the same way? What are the tools we use to know something (reason, senses, direct apprehension, and so forth)? Do we possess innate knowledge? Is there a limit to what we can know?

a. The Mozi, Later Mohists and Debaters (bianshi)

When Mozi rejected the commonly held belief that reality is fated (ming), his students asked him to set out the philosophical bases for knowing how to judge between views. In general, the response he makes to this question serves as a reasonable outline for his theory of how to establish a claim’s truth. He insists that knowledge must be pursued by means of three criteria (Mozi 35.5). Mozi’s first test for judging between knowledge claims is what we may call an examination of the received belief about the claim. This is understood as what the historical records report. The second truth test is what Mozi calls “the evidence of the eyes and ears of the common people”. He takes this to mean direct experiential testimony to the truth of a claim. His third test is to observe whether acting on the claim yields the expected results, which should obtain if the claim is true.

Applying these three criteria leads Mozi to accept the claim that ghosts and spirits exist. He argues that received knowledge includes the intervention and existence of spirits as explanatory devices and that there is widespread testimony to the presence of such phenomena. Most importantly, however, Mozi feels that the pragmatic implications of giving up such a belief would be disastrous; cruelty, robbery, and warfare, for Mozi, are common precisely because people have come to doubt whether ghosts and spirits exist or not. He says, “If all the people of the world could be brought to believe that ghosts and spirits are able to reward the worthy and punish the wicked, then how could the world be in disorder?” (Mozi 31.1)

Mozi’s students, and their students, developed his interest in how we know something to be true in the years following his life. In Records of the Grand Historian, Sima Tan (d. 110 B.C.E.) identified a group of thinkers he called Mingjia (名家, School of Names). These thinkers have been variously classified as debaters, rhetoricians, dialecticians, logicians, and skeptics. In the Warring States Period (c. 475-221 B.C.E.), however, the name used more generally for thinkers occupied with such epistemological questions was bianshi 辯士 (often rendered as “disputers” or “rhetoricians”). The approaches and arguments of the bianshi can be associated with the work of the so-called Later Mohist philosophers. We know this group of thinkers largely through the final six chapters of the Mozi text (Chapters 40-45), which form an entirely different unit than the earlier sections of the work.

Outside the Mozi text, the ideas of two of the bianshi are known to us through sources in which we may have some degree of confidence: Hui Shi (307?-210? B.C.E.) and Gongsun Longzi (b. 380? B.C.E.). Hui Shi shows up in nine chapters of the Zhuangzi. The text Gongsun Longzi is attributed to Gongsun Long. (For an English translation see Mei Yi-Pao (1953)). It is nearly certain that later bianshi would not have accepted Mozi’s position that ghosts and spirits exist.

b. Lao-Zhuang Traditions on Knowing and Truth

There is much in the Lao-Zhuang tradition that seems to suggest anti-intellectualism and anti-rationalism. It is said that sages make sure the people are without knowledge (DDJ 3), those who pursue the dao are cautioned to abandon learning (DDJ 20), states are said to be difficult to rule because the people “know too much” (DDJ 65), and the knowledgeable are contrasted unfavorably with the enlightened (DDJ 33). Moreover, wuwei as a distinctive form of conduct is a teaching without words (DDJ 43) and comes through an experience of numinal vision and confirmation (DDJ 10). However, these passages do not set out a form of anti-intellectualism. Interpreted in their context, they are part of the Lao-Zhuang insistence that the distinctions and concepts by which reason works are of human design, which may mislead people about the nature of reality or tangle them in problems they create themselves. Reason, evidence and argument have their place, but they do not extend to the fullness of freedom and happiness achieved in following the dao.

The Zhuangzi, too, seems opposed to critical inquiry and application of reason and logic. People are cautioned not to wear out their brains with distinctions (ZZ ch. 2). The text uses many examples to point out that what a person thinks he knows is really relative to context and not absolute, and what a person knows is nothing compared to what he does not know (ZZ ch. 17). People are warned that skillfulness in argument culminating in “winning” the point is not equivalent to arriving at truth (ZZ ch. 2). Rhetoricians and logicians are compared to nimble monkeys and rat-catching dogs (ZZ ch. 12). They are skillful at rational gymnastics, but poor at realizing truth. Instead, truth comes through stillness, emptying oneself of rational and human distinctions (that is, naturalness) and direct receptivity of the presence of dao (ZZ ch. 21). For all the affection and friendship between them, Zhuangzi did not approve of the bianshi thinker Hui Shi’s approach to knowledge.

In this tradition, the power to master life and the ability to control one’s transformation is not an achievement of reason. This is not the same as saying that the Lao-Zhuang teachers had no use for reason and sense evidence. Truth comes from oneness with dao. When realized, one flows in life spontaneously and effortlessly, without thought, just like the famous butcher of the Inner Chapters, Cook Ding, who cuts up an ox without ever hitting a bone or dulling his knife (ZZ ch. 3).

c. Mencius (Mengzi, c. 372-289 B.C.E.) and Analogical Reasoning

Although it is often said that classical Chinese philosophers did not place a premium on argumentation, Mencius was a master of the use and criticism of analogical argument. This was the most prevalent method of approaching knowledge and establishing truth among 4^th century B.C.E. Chinese thinkers. Mencius often used this method in his criticisms of other philosophers such as Mozi, Gaozi, and Yangzi.

Analogical reasoning in this period included both the use of one thing to throw light on another and the use of one proposition known to be true to throw light on another of similar form, the truth of which was undetermined. Two advantages of this form of argument in the classical period have been identified. One is that an analogy is often as valuable epistemologically when it breaks down as when it works. The second is that analogy is often the only tool available for exploring a subject that is obscure or one that eludes direct experience. Mencius and his interlocutors carry on their debates in the Mengzi largely through the method of analogy.

One example of Mencius’s use of analogy is his famous exchange with Gaozi recorded in the Mengzi 6A2. In this passage, Gaozi criticizes Mencius’s view that human nature has an inborn tendency to seek goodness by saying that human nature is like water; it will seek whatever outlet is available, showing no preference for flowing East or West. While accepting the analogy between human nature and water, Mencius reminds Gaozi that although water does not prefer East to West, it most surely has the nature to flow downhill, rather than uphill. Likewise, Mencius concludes, human nature has the propensity to move toward the good, just as water seeks downhill.

d. Xunzi (310-220 B.C.E.): Dispelling Obsessions

According to Sima Qian, Xunzi was once the leader of the Jixia Academy, a site where thinkers of the 100 schools (baijia) were represented. Xunzi made skillful appeals to both empirical and rational sources as necessary for arriving at knowledge. Yet, he held that discursive reason could not resolve quandaries if it excluded feeling and emotion, appealing to xin (heart-mind) as an arbiter of truth whenever it operated in a clear state (da qingming), setting aside presuppositions and unruly emotion. To prevent such confusions in understanding, Xunzi turned to the concept of fa, meaning “criterion” or “standard”. He held that reasoning, whether analytically making distinctions or synthesizing diverse positions, operates by rules that approximate the way in which a geometer might judge a circle by using a compass. To know something is to be guided by these standards of reasoning to a conclusion. In Chapters 21 and 22 of Xunzi, he says the heart-mind draws distinctions among reasons, explanations, and desires similarly to how the eye draws distinctions among colors. Xunzi insists that we never cease learning and investigating. It is just such cumulative knowledge that can save us from obsessions and superstitions, leading us to focus instead on activities that will create a more humane world.

e. Wang Chong (c. 25-100 C.E.): Critical Chinese Philosophy in the Classical Period

Wang Chong was a critic of many received views on ontology, morality, religion, and politics. His writings on these subjects were compiled into the work entitled Critical Essays (Lunheng). It reveals Wang’s critical and somewhat skeptical mind at work, and also his flair for originality in approaching philosophical problems at the end of the period of classical Chinese philosophy. Speaking of his own work, he says, “Although the chapters of my Critical Essays may only number in the tens, one phrase likewise covers them all, namely, ‘hatred of fictions and falsehoods’” (Critical Essays ch. 61).

Wang is keenly aware of the tensions between empirical and rational pursuits of truth, and he insists both must play a role in the advance of knowledge. One cannot depend only on experience because it can be deceptive; thus reason (xinyi) must be involved. He says bluntly that the Mohists did not use their minds to verify things, but indiscriminately believed what common people reported to have experienced. Thereby, the Mohists fell into deception (ch. 67). Moreover, against the Daoists he holds that history never affords any instances of men knowing what is true without inquiry and reasoning (ch. 2). However, Wang is also aware that one could make a coherent set of premises into a logical argument that nevertheless would contradict ordinary and uniform experience and thus be untrue.

In his practice of testing differing positions and claims, Wang often uses the method known in Chinese as “arguing from a lodging place”. This is similar to the strategy of assuming an opponent’s position “for the sake of argument”. Wang believes that by adopting this tactic he can most easily reveal the logical flaws or evidential weaknesses of a position he thinks is false. He frequently makes use of the reductio ad absurdum technique; that is, he shows that an untenable or absurd result follows from accepting the belief in question. His skillfulness in seeing the limitations of both reason and experience as sources for claims he considers weak is one explanation for why early sources, perhaps as a way of ridiculing Wang, wrongly grouped him with the qingtan (“pure talk”) masters, who were skilled rhetoricians and said to be more intent on making arguments than on gaining truth.

Wang does not believe that all questions can be answered because he insists that one cannot find the truth on the basis of partial evidence alone. Here his approach brings into light the distinction between belief and truth. Many more things can be believed than can be known. Believing something is not “knowing” it. Wang’s use of the term xu as “false” refers to a belief of a certain type. He held that claims shown to be false do not attract us. No one knowingly believes a falsehood. But xu beliefs have not been conclusively falsified and they have attractive features, such as making us feel better about life events, or ourselves, and thus they are difficult to give up believing. Wang’s way of understanding xu helps us to make sense of passages in which he talks as if xu beliefs possess an attractiveness that entices the undisciplined mind.

Wang has no patience with what he considers to be the superstitions of his day, and he does not hesitate to criticize his predecessors, including Confucius, Mozi, Mencius, and those thinkers involved in trying to create a synthesis with the five-phase cosmology and its related belief systems. He uses argument and empirical evidence to criticize the worship of Confucius, to debunk belief in omens, to discount any evidential basis for fengshui, and to show the contradictions in a belief in ghosts and spirits. Wang argues that Heaven (tian) is merely a name for natural physical processes, which are not powers to be assuaged by ritual or prayer. Rather, they are processes to be studied through observation and reason.

f. Tiantai Buddhism’s Threefold Truth Epistemology

The defining thesis of Tiantai is actually epistemological. As advanced by the philosopher Zhiyi, it is the teaching of Threefold Truth (san di), which includes the following points. 1) We can make true statements about the world of ordinary objects. These truths are about things that exist and their interactions in a network of interdependent causes. These are the truths of history, science, and so forth, about provisional existence. 2) It is also true to say that all things are empty (kong di) and have no permanence. Everything in reality is devoid of any self-nature. Of course, it is the realization of this truth that liberates one from suffering because it breaks one’s attachment to things and persons who are the objects of our desires. 3) The mundane or phenomenal world is real, but it is also impermanent and ultimately empty. This is truth as the Middle Way (zhong di).

Zhiyi thought that persons had varying epistemological capabilities, which put them on different levels of knowledge. Some people are only able to grasp truth in its mundane expression. For them, truth enables engagement with the world and its pleasures, desires, and attachments. They suffer because of this, although they may resist desires through moral action, prayer, devotion and the like. Conversely, others express truth as per the Threefold teaching; that is, as emptiness. They detach from the mundane, living apart from it as much as possible. But for those who are capable of it, truth is seen for what it is, and yet they live in the mundane, knowing it is real; but also seeing its emptiness.

g. Wang Yangming on liangzhi: Direct, Clear, Universal Knowledge

Wang Yangming wrote, “What I mean by the investigation of things (gewu) and the extension of knowledge is to apply the pure knowledge (zhi liangzhi) of my mind to each and every thing.” According to Wang, even ordinary knowledge gained by the use of reason requires the direct and clear apprehension of Principle(s) (li) innate to human minds. However, there is also knowledge that cannot be acquired or transmitted by discursive reasoning. He once said to a disciple, “Knowledge acquired through personal realization is different from that acquired through listening to discussion. When I first lectured on the subject, I knew you took it lightly and were not interested. However, when one goes further and realizes this essential and wonderful thing personally to its depth, he will see that it becomes different every day [i.e., in its guiding power] and it is inexhaustible” (Instructions, sec. 11).

According to Wang Yangming’s biography, while exiled in the Guizhou region he experienced a kind of direct enlightenment or pure knowledge (liangzhi) after which he began teaching what he called “the unity of knowledge and action” (zhixing heyi). In liangzhi, one is impelled to act in a certain way. Following this, the person can be said to possess the knowledge of how to act. But there are not two events, one volitional and the other epistemological. The acting is the knowing.

In 20^th century Western philosophy, British thinkers wrote deliberately about the distinction between “knowing that” and “knowing how”. On these terms, Wang Yangming’s notion about the knowledge gained in liangzhi is a third concept, one that has affinities with both epistemology and action. Liangzhi is not a “faculty” of the mind or a special kind of “sense”. Nor is liangzhi the sort of knowledge by which one knows where to dig a well or when to plant crops. One cannot know everything by liangzhi, for example, whether there is evidence of water on Mars. Yet, Wang says that when our heart-mind is operating by liangzhi, a person is moved irresistibly to act freely from all obstruction caused by desires; and within acting lies knowing what to do.

h. Hu Shi (1891-1962): Pragmatism and Experimentalism

When John Dewey arrived in Shanghai on May 1, 1919, the story of Western philosophy’s impact on Chinese thought turned a new page; American Pragmatism’s influence on Chinese intellectual history had begun. Hu Shi claimed that no Western scholar up to that time had exerted the magnitude of Dewey’s influence.

Hu Shi’s contribution to epistemology in Chinese philosophy seems based largely on his adaptation of Dewey’s pragmatism, which Hu preferred to call “experimentalism”. Hu follows Dewey in thinking that the function of the concept of “truth” in the theory of knowledge is instrumental. This means that Hu Shi’s view of truth can be set apart from some other epistemological approaches. He does not think a claim is true if it corresponds to the way the world is; that is, if the claim expresses what humans see, feel, hear, and so forth. Rather, he thinks that saying a claim is true means that the claim may be employed as an instrument to deal with the environment and context of everyday life. True beliefs enable people to deal with life situations effectively and consistently. This means that as life realities change, so might the claims that are “true”. Thus truth is not a minted coin that never changes. He specifically uses this approach to free himself from the views of ancient Chinese sages and their writings, which he feels should be studied largely as historical artifacts and much less so as viable philosophical options.

However, Hu Shi’s view of truth, like Dewey’s, is no mere subjectivism. Instead of truth being something that is relative to the individual, Hu argues that a claim that something is true requires that it be demonstrated experimentally. He has, however, a very broad view of what counts as an experimental demonstration of a claim. By “experimental” he means demonstration according to the scientific method of experiment and confirmation. Yet, he regards this method as only one way of establishing a claim’s effects when it is true or disconfirming the claim when it is false. This way of proceeding has specific implications for his social theory. He thinks that claims being made about economics, politics, morality, and the social sciences can and should be confirmed experimentally by observing whether the observable outcomes of the claim’s being true or false can be confirmed in actual practice.

In the context of Chinese epistemologies, Hu stands out as opposing all kinds of authoritarianism and dogmatism; simply because Confucius or Zhu Xi or some other figure says something, it does not make it true in the current context.

i. Zhang Dongsun (1886-1973): Pluralistic Cultural Epistemology

In the early half of the 20^th century, Zhang Dongsun was one of the most important philosophers in China, especially owing to his efforts to establish, in dialogue with Western philosophy, a unique philosophical epistemology in the Chinese context. This approach has variously been labeled as Pluralistic Epistemology or Cultural Epistemology.

For Zhang, what counts as evidence, what we seek to know, what we think it is possible to know, what we notice through our senses, how we interpret our sense perceptions, and what qualifies as a sufficient reason to say we know something all represent epistemological positions that are inevitably culturally defined and structured. Persons are not merely acculturated to observe festivals, organize themselves socially, or valorize certain heroes. They are also shaped by their cultures to operate epistemologically in different ways.

The most obvious way in which all epistemology can be shown to be cultural is that knowledge is expressed in a particular language. Of course, language is a cultural product. Languages have grammar and structure, and these embody logic and rules for reasoning. For example, Zhang argued that the structure of Western languages leads philosophers to look for the substance underlying the attributes predicated of an object. So, the investigation of the nature of substance itself became one of the central problems of Western philosophy but it did not arise in Chinese philosophy, because the language is differently constructed.

Zhang’s pluralistic cultural epistemology has no room for truth that transcends all cultures, or for the idea that there is a universal criterion for knowledge, such as the correspondence of a proposition to external objects. Knowledge is always mediated through culture. Knowledge and truth are functions of established criteria within a specific cultural epistemology. Further, there is no way to approach “reality” that is free of the cultural constraints determining what one is looking for, what questions one asks, or what is taken as sufficient evidence for a belief.

3. Moral Theory: Fundamental Questions on Morality

Moral theory and ethics are concerned with questions such as these: How should we live? Is the ultimate purpose of our lives to pursue happiness or pleasure, obey moral rules, please others or higher beings, or follow our own interests? As for the origin of our morality, do we invent morality and agree to it, is it inborn or part of our nature, or is it given by a higher being or intelligence? Is something good or right to do depending on the consequences of the action, our duties, or our passionate feelings? Is morality universal, or relative to its culture or the individual? Are the most basic and important things in morality the actions we do or the sort of persons we are? Many of these questions are addressed directly and indirectly throughout the history of Chinese philosophy.

a. Confucius (551-479 B.C.E.): the Exemplary Person Ideal

The first access that most people have into Chinese philosophy in general, and certainly into the thought of Confucius, is through the Analects (Lunyu). This work is an anthology of selected sayings in which Confucius is often the main teacher. When speaking of morality, the term Confucius uses that is perhaps the closest in meaning is li, often translated as the rites that guide conduct. Li refers to the manner of comporting oneself that helps people transcend animality, develop humaneness (ren), and even exceed present ways of being human by raising themselves to higher expression. This expression is captured by the concept of “exemplary person” (junzi).

In the Analects, the humane (ren) person is able to endure hardship and enjoy happy circumstances (4.2), identify good and evil (4.3), and be free from the desire to do wrong (4.4). Being ren comes through self-cultivation and observing li, and it cannot be reduced to the dichotomy often found in Western moral theory between action (doing) and character (being). Confucius recognizes the importance of both what persons do and the sort of person one is. A person of ren character will act in a certain way; the construction of this character cannot occur without doing the li acts derived from and embodied in the lives of persons who have gone before as our exemplars (junzi). Making oneself into such a person is the work of self-cultivation.

There is no single word in the Analects for self-cultivation; but as a concept Confucius teaches, its imprint is present in the earliest stratum of his teachings. In thinking of the dedication and commitment needed for cultivating oneself, Confucius calls on his disciples to give their utmost (zhong) (3.19). Self-cultivation is not simply learning from books; it includes character development, enhancing talents, and refining (wen) one’s humanity itself (5.15). Cultivating oneself into an exemplary person is never merely reduced to one’s moral actions or values. Confucius recognized that in the activity of self-cultivation everyone makes mistakes, but he taught that it is tragic to repeat a mistake or fail to reform after making one (9.25). Confucius thinks of human development as taking a raw piece of jade and carving and polishing it until it is fully refined (9.13).

Six of Book 4’s analects specifically describe the exemplary person. Such a person always does what is appropriate (yi) (4.10, 16), cherishes moral excellence (de) (4.11), and is not driven by desires (4.10). Exemplary persons take the high road, not the low one (14.23), and they feel ashamed if their high-sounding words are not fully reflected in their deeds (14.27). Indeed, exemplary persons cherish their excellence of character over power, land, or thought of gain. Exemplary persons take as much trouble discovering what is right as lesser men take to learn what will pay (4.16).

Confucius’s teachings on the exemplary person and self-cultivation are the touchstone for the moral and human ideals of Chinese culture down to the present day.

b. Mohist Moral Philosophy

The doctrine of “Inclusive Concern” (jian ai) is the best known of all Mohist teachings. Mozi took the position that in order to achieve social order people must be concerned for each other, showing care for others and not merely for themselves or their own families. This position was used by the later Mohists to criticize what they took to be the Confucian view; namely, that one has moral responsibilities and duties only to those to whom one is related (that is, the Five Relationships of Confucianism—ruler/subject; parent/child; spouse/spouse; elder sibling/younger sibling; and friend/friend). Practically speaking, jian ai meant that in relationships with others, people should seek mutual benefit and express mutual respect.

Mozi understood the chief problem of humanity in its “state of nature” to have been a world of plural moralities and competing values that eventuated in disorder, selfishness, and evil. So, he argued that a coherent social order must rest on a common and coherent morality that is absolutely and universally true. He held that if pluralism of moral values is allowed to exist, conflict would be the inescapable result. Accordingly, two preeminent philosophical questions occupied Mozi. What is the source of true morality? What is the content of true morality?

Mozi praises Heaven (tian) as impartial, generous, wise, just and caring, and regards it as the source for true morality. Heaven cares for humans and benefits the worthy by providing resources and blessings, while judging and punishing the wicked. Heaven has a dao that orders all things, including its relations with humanity.

Mozi finds the reliance on elitist consensus as the source for morality, which he associates with Confucius and the ideal of the exemplary person (junzi), to be both unconvincing and flawed. He takes the view that even if a practice is traditional (for example, received rites such as li) it is not necessarily morally right. He makes a distinction between custom and morality, associating the Confucian li with custom, while advocating objective moral standards coming from Heaven that he calls fa. For such moral norms, the analogy Mozi uses most often is the plumb line or the L-square. He points out that the function of these tools is to guide the performance of work. They are reliable, objective, and even the novice can employ them. The function of Heaven’s standards is to provide an absolute and universal guide for human life.

As for how we know the content of Heaven’s true morality, it is mediated through the ruler to the layers of hierarchy down to the people. Specifically, Mozi argues that in human society prototypes, exemplars, and role models that exhibit correct judgments and true morality do so because they are following the will of Heaven or the standards (fa) specified by the Son of Heaven (ruler) who perceives and understands the divine source of morality. While following these standards will yield the best and most efficacious results, Mozi is not strictly a utilitarian. He does not say that examining or quantifying the desirable consequences of an action determines moral right or good. Good outcomes result from following Heaven’s Way and represent confirmation that the standard is from Heaven, but the consequences themselves are not the source of morality’s content.

c. Lao-Zhuang Traditions and wuwei

The Daodejing teaches that when individuals try to make something happen in the world by their own reasoning, plans, and contrivances, they inevitably make a mess of it. But if they take their hands off the course of their lives and move with the dao, then it will untangle all life’s knots, blunt its sharp edges, and soften its harsh glare (DDJ 56). This is relevant to an understanding of Lao-Zhuang teachings on morality because moral distinctions are regarded in this tradition as the kind of tampering and “trying to make something happen” that is warned against.

In Chapter 18 of the Daodejing, the ancient masters have transmitted the teaching that it was only when persons abandoned oneness with dao that they began to make distinctions in morality. The Daodejing makes this point by specifically mentioning in a critical manner several of the distinctions made in Confucian moral and social philosophy: humaneness (ren); appropriateness (yi); filiality (xiao); and kindness (ci) (DDJ 18). If humans had continued in their primal oneness with dao, they would not have needed to invent such moral discriminations. So, in the Lao-Zhuang traditions there is a call to return to human inner nature that moves with the dao and away from the conventions of morality.

In the Zhuangzi, making distinctions of these sorts is considered a disease that is condemned in several logia of the text (ZZ chs. 2, 5). In the Lao-Zhuang traditions, struggling over these human-made distinctions represents the source of all strife in the world. The key is not to begin this process at all or to empty oneself of it by forgetting such distinctions and returning to the unity with dao, expressing its power (de).

For both the Daodejing and Zhuangzi, the concept wuwei is used to report a kind of effortless, spontaneous conduct that invariably expresses moral efficacy without deliberation or calculating consequences. This is not an ability that is available to persons without preparation. A person caught up in making moral distinctions should not expect to be able to wuwei (as a verb) without first entering into oneness with dao by forgetting those very distinctions. As the Daodejing says, “The [persons who possess the] highest de (virtuous power) do not strive for it and so they have it” (DDJ ch. 38).

The holiness of wuwei conduct rests on the fact that moving in this manner accords in the situation with an efficacy that can only be attributed to the dao; it could never have resulted from human wisdom, planning, or contrivance. This is not to say that such action might not correspond to conventional human moral belief. Rather, the point is this: While moving in wuwei may look to the outside observer like moral conduct following human distinctions, its origin lies in empty stillness. It is a hopeless pursuit to invert this process and think that by following human morality one will come upon the dao or be able to wuwei.

The Zhuangzi compares the spontaneous and effortless action of wuwei to the kind of prehension Cook Ding experiences when he cuts up an ox without ever hitting a bone or dulling his knife (ZZ ch. 3). Zhuangzi’s disciples also gathered several stories about conduct analogous to wuwei. For example, stories of extraordinary swimmers and divers; the ferryman in the gulf of Shangshen who handled his boat with commensurate skill; the amazing cicada-catching hunchback; Ji Shengzi, the game cock trainer for King Xuan; Bohun Wuren’s skill in archery; Qing, who makes bell stands that seem to be the work of the spirits; and Chui, the artist who can draw free-hand as true as a compass or T-square (ZZ ch. 19).

d. Mencius (c. 372–289 B.C.E.): Morality as Cultivated Human Nature

The Mencius text records Mencius’s position that humans are distinct from other sentient creatures in having a “moral lens” made up of four propensities (siduan). Another way of saying this is that humans moralize in a way analogous to how a corn kernel yields corn and not tomatoes. Mencius means that humans do not start out as blank slates having to learn to moralize. All humans must learn specific moralities, but they are enabled, and even inclined, to do so because of the four propensities that he calls “seeds”. For Mencius, humans are good by nature. This view marks the beginning of his philosophy of anthropology.

When reading Mencius, the early Chinese ontology that he inherited must be kept in mind. For him, there is no object that is a self or soul as found in Western philosophy. There is no identifiable “humanness” that sets humans apart from animals. Nevertheless, there is a sort of five-phase correlation of qi that has produced a human rather than something else. The four propensities are part of this structure, and they may be stated as follows: One whose heart-mind (xin) is devoid of compassion, shame, courtesy and modesty, and moral discretion is not human (Mencius 2A6).

The fact that Mencius chooses agriculture metaphors when writing about human nature suggests he is being consistent with the early Chinese ontology that influenced him. The Chinese graph for nature (xing) is related to a word meaning “to be born” or “to live/grow” (sheng). Thus, “nature” can refer to the defining characteristics of a thing, but it can also refer to the characteristics of a thing that will develop over the course of time if given a healthy environment.

Chinese philosophy does not insist on a thick understanding of essentialism. Yet, this does not mean that people are born without generally defining propensities. There are inborn, transitive, generational patterns that create bodies. To be devoid of these or possess some other set might eventuate in some other creature, but not a human body. Likewise, for Mencius, anyone devoid of the four propensities of morality lacks a human nature (xing) and cannot become human.

Mencius’s position cannot be falsified by human wrongdoing. He does not mean that humans are innately programmed to be morally good, or that they will automatically grow into morally good beings. The kernel will produce corn, but not if it is deprived of cultivation. Likewise, human nature is predisposed by means of inborn tendencies to act morally, but being morally good is not automatic.

Evil and violent times can retard the youth, just as drought can harm the crops (6A7; 6A9). The great and luxuriant trees of Ox Mountain are beautiful, but if constantly lopped by axes, we cannot be surprised if the mountain appears bald and ugly. The same is true of a person who repeatedly cuts down the sprouts of his moral intuitions and follows a way of immorality (6A8). On the other hand, Mencius thought that the incipient seeds of morality would grow, with cultivation by li, into the humane person (ren).

The cultivation of these seeds enables a person to increase in humaneness (ren) just as a fire that continually builds or a spring that has begun to vent will flow ever more strongly (6A6). Mencius believes a person can, by virtue of cultivating his inborn moral endowments, find a special kind of energy that he calls “flood-like energy” (haoran zhi qi), which brings both delight over one’s decisions and power to continue performing virtue. In taking this approach, Mencius is making the difference between his position and that of Mozi very clear.

e. Xunzi (310-220 B.C.E.): On the Carving and Polishing of the Human Being

Unlike Mencius, Xunzi believes that human nature is disposed to self-interest and that, left alone without moral guidance and the restrictions of law, self-interest will degenerate into selfishness and breed disorder and chaos. Goodness will not grow from within like corn stalks from kernels because human inclinations are not the four propensities Mencius identified, but desires for beautiful sights and sounds, comfort and power. Unless controlled, these and other desires become violence, willful violation of others, and destruction.

Xunzi says that the sage-kings established moral rites, such as discriminations of right and wrong, and li, to shape, guide, and control people. For Xunzi, human beings invented morality; they did not discover it within themselves (Mencius) or have it disclosed to them by Heaven (Mozi). Accordingly, if the sage-kings had not invented the rites, there would have been no civilization and no order. Subsequent generations must be transformed by the influence of teachers and models, and follow especially the guidance of morality and rituals of human conduct (li) handed down to them.

Humans depend on the rites of morality created over generations by exemplary humans to shape and carve individual being into something worthwhile. A way of extending the importance of this difference between Mencius and Xunzi is to notice the shift in metaphors that Xunzi makes. Where Mencius used agricultural metaphors, Xunzi employed craft analogies: woodworking, jade carving, home construction, and so forth. For Xunzi, humans by nature are like warped pieces of wood that must be steamed, put into a press, and forced to bend into a straight shape.

He holds that even children must be taught to love their parents and be filial, a position contrary to that of Mencius, who thinks this is a natural inclination. Xunzi believed that if Mencius was correct and human nature was such as to move persons toward the good like water flowing downhill, then there would be no necessity for the emergence of morality or li (Xunzi; Watson 1963: 253).

In Chapter 17 of the Xunzi, Xunzi makes the point that Heaven does not care about human behavior, or how the course of things affects humans. In this, he takes a view much different than that of Mozi. Heaven cannot be appeased or persuaded to bring humans good fortune. If there is good fortune for humans, it is because persons make it happen through responsible government and well-ordered society. Neither does Heaven make people poor or bring calamities. Heaven has no will and no mind, and thus does not act to bring judgment or reward. The well-being of persons and societies is squarely in the hands of humans acting morally. All of these claims are contrary to Mozi’s view.

f. Buddhist Moralities in the Chinese Context

i. The Way of Precepts

In Chinese Buddhism, the moral life is understood in a way similar to the epistemological one. There are multiple levels. On the lowest level, that of the lay followers, Buddhist morality looks in many ways like a conventional moral system. Various Buddhist schools share the basic code of ethics called the Five Precepts for the guidance of life when a seeker is at this lowest level. These entail abstinence from (1) killing, (2) stealing, (3) sexual misconduct, (4) lying, and (5) intoxication. Some Buddhist schools add three or five precepts to these. The so-called Ten Precepts form the conduct guides for monastic orders. The best-known companion concept to Buddhist morality at the level of precepts is the concept of karma. Karma may be regarded in its most basic sense as the product of one’s past actions. These products may be behavioral consequences, mental conditions or physical states that result from one’s acts.

Individuals living by moral precepts may stand out among others as good and ethical. They may receive awards and recognitions. We may seek them out in our relationships. In its highest forms, this is the Buddhism of compassion for the world, which seeks to remove evil and suffering by living a pure life and contributing to the welfare of others.

However, while such persons are still thinking of life and existence under moral precepts, they remain “in training” and in bondage to volition, names and forms of discrimination that persons use, and desire and suffering; even the desire to be good may cause suffering. They are also still subject to mental anguish and physical attunement.

A higher level of morality than that of following precepts is possible, even as a result of following those precepts. However, a crucial difference occurs when the training eventuates in enlightenment. In nirvana’s extinguishment of desire and vanishing of suffering, all moral precepts may be dispensed with as well. One who has climbed to the heights no longer needs the ladder. Since one is emptied of the attachments and desires moral precepts are meant to control and erase, there is no longer any need for them, nor any function for them once the job is done and desire is extinguished. Such an enlightened one transcends ethics and precepts, and is set free from morality.

ii. The Hua-yan (Flower Garland) Bodhisattva

Hua-yan (Flower Garland) Buddhism valorizes the form of existence known as Bodhisattva. To be a Bodhisattva is to dwell in the margins between experienced enlightenment and surrounding moral and karmic views. The Bodhisattva has already abandoned desires and the discriminations of the mundane world that are the cause of suffering. Accordingly, the Bodhisattva dwells in this world with a mind that transcends that which causes suffering and has no attachment to the self. Those still caught in this world are attached to the self and to the discriminations of existence, and they suffer because of the desires such attachment creates. When a Bodhisattva lives among such people, the difference is obvious and the other sentient beings see that the Bodhisattva does not suffer. Thereby, the Bodhisattva becomes a savior.

iii. The Way to Morality in Chan Buddhism

Better known in the West by its Japanese equivalent, Zen, chan comes from the Sanskrit term dhyana, which means “meditation.” Although meditation is not the only practice employed in Chan Buddhism, its central role in ethics is important. In Chan Buddhism, the task is not to use reason or a calculus of consequences of actions in order to arrive at knowledge of one’s duties, but rather a person readies himself to act morally through meditation. This state is empty of content such as rules and duties. Indeed, one who practices this as a component of ethics does not say that he “knows” what to do or what is right. Rather, one has set aside the need to speak of the ethical life as connected to moral knowledge. In this state, persons have no need to draw their bearings from culture, community, or any sacred book. For such persons, meditation is the key. It is a sort of alternate consciousness that will enable one to act spontaneously, without calculation or feelings of resistance from the will. Similarities between Chan Buddhism and the concept of wuwei in Lao-Zhuang provide the backdrop for many historical instances of contact and exchange between the traditions.

g. Zhu Xi: Fashioning the Human Being

Whereas Western philosophers often engage in a discussion of the ultimate meaning or goal of human life, frequently associating it with happiness, Zhu Xi identifies the fundamental purpose of human life and its moral objective as equilibrium and harmony (zhonghe). For Zhu Xi, when humans realize equilibrium and harmony to the highest degree, heaven and earth will attain their proper order and all things will flourish. Accordingly, the purpose of morality is self-mastery by yielding to the Principle(s) (li) underlying reality. It is never merely self-realization. Rather, the person of ren (humaneness) “forms one body” with all things. Those who make a cleavage between objects and distinguish between the self and others are petty persons; that is, xiao ren.

Zhu’s ontology is closely connected with his approach to morality. Rather than taking the view that human nature is good or evil, his position is that owing to the way the five phasal elements come together to shape humans, one will be enabled to express the principles and patterns of Heaven. That is, one will be a sage or an evil person, mentally deranged, or a genius (Conversations 4.13, 15). Zhu thinks that he has thereby resolved the philosophical debate between Mencius and Xunzi.

Harmony for Zhu Xi is not so much “knowing yourself,” as Socrates would have it, nor is it identical with Aristotle’s eudaimonia (human flourishing). However, Zhu may well have taken the position that harmony (he) is necessary to both the Socratic and Aristotelian projects. Yet, we may wonder whether harmony is a sufficiently robust and satisfying moral ultimate for human life. The question, then, is whether this calls us to the highest levels of achievement as humans.

h. Wang Yangming: Moral Willing as Knowing

While Wang Yangming was critical of Zhu Xi’s thought, he was influenced by the Neo-Confucian thinkers who went before him. Wang adopted their vision that the great man can regard Heaven, earth, and the myriad things as one body, holding that one does so not because he rationally decides to, but because it is natural to his heart-mind (xin).

In contrast to Zhu’s stance of “forming one body” with all things, Wang holds that the “great person” moves by pure intelligence (liangzhi). There is a direct awareness of being one with those in need and acting on that awareness. For Wang, “awareness” is not simply “feeling” or “reason,” which form the usual extremes in Western thought. Rather, feeling and reason are combined as in the Chinese notion of “heart-mind” (xin). This meaning of awareness gives the agent a unifying perspective for experiencing and dealing with all persons, things, and events.

Wang thinks that the direct awareness of Heavenly Principle(s) (tianli) as a moral guide is discovered not by following a moral exemplar, obeying a divine command, or by utilitarian quantification of what action will yield the greatest happiness for the greatest number. Neither does it come into view at the end of a rational process of solving a dilemma one might face. Rather, awareness of tianli is discovered by introspection.

For Wang, the experience of moral enlightenment in liangzhi transforms desire and affections so that individuals freely act. By acting freely, the Way is known. This is the crucial point. Wang is not saying that people who know what is the right thing to do must use their will to redirect their desires and passions in acting upon this knowledge. Rather, he is saying that the transformation of the will is knowledge of the good (Instructions for Practical Living sec. 5).

Yet, we may ask how to distinguish choice through liangzhi from simply “doing what one wants to do” or what “one’s conscience tells one to do”. Wang anticipates this criticism by insisting that, while liangzhi is inherent in all persons, it is the distinguishing characteristic of the mind of the sage. As one prepared by study and deep reflection, the sage’s grasp and awareness of liangzhi is beyond the ordinary. So, one who does not practice like a sage cannot hope to experience his or her own internal powers of liangzhi.

i. Mou Zongsan (1909-1995): Moral Metaphysics

Mou Zongsan coined the term “moral metaphysics” and understood this activity to be primarily occupied with the most basic existential inquiries of humans, such as “What should I do?” and “What makes my life meaningful?” Mou argued that in doing moral metaphysics one must notice a two-directional movement between the human and Heaven. He thus used the concept to focus on the transcendent sources of morality. Mou borrowed the philosophical framework of German philosopher Immanuel Kant (1724-1804), but offered his reading of the Neo-Confucians as a corrective to points he believed Kant had gotten wrong.

He understood Kant’s view to be that morality was a priori. In the Groundwork of the Metaphysics of Morals (1785) and The Critique of Practical Reason (1788) Kant developed a method for identifying the pure moral duties of humans by subjecting any candidate duty to what he called the Categorical Imperative: “act always on the maxim which you can will to be universal law.”

For Mou, the Neo-Confucian understanding of equilibrium (zhong 中) in the heart-mind (xin) of every person, where the Principle(s) of Heaven were known immediately, was to be preferred to Kant’s Categorical Imperative.

Mou realized that his position required one to commit to a metaphysics (ontology) in a way that Kant’s did not. This is one of the reasons he inverted Kant’s way of speaking about “the metaphysics of morals,” by which Kant meant to identify the presuppositions for morality as we have them.

According to Mou, the Principle(s) of morality could be apprehended by a direct, immediate awareness of the heart-mind, not by the use of Practical Reason as Kant argued. Moreover, he held that creative free action is a manifest reality in the lives of the sages, not merely a postulate of pure practical (moral) reason as Kant held (Mou 1968: 10-13; 43-5). Mou argued that the sages had also connected the finite (what Kant called the phenomenal world) with the infinite (the Principle(s) or what Kant called the noumenal world).

In Kant, the highest good is when happiness occurs in exact proportion to virtue. But Kant said the confluence of optimal virtue and happiness does not and cannot occur in this world. So, morality requires that we postulate both an immortal soul and a God who is able to bring virtue and happiness together. Mou objected to this analysis in Kant, because he thought that personalizing the process that brings virtue and happiness together only pushed the problem to another level. Mou did not see how postulating God provides assurance that virtue and happiness would coincide. How would we know that God would wish to bring virtue and happiness together? Mou preferred another resolution; he held that the concrete example of the sages proved that Heavenly Principle(s) can be manifested in human practice and need not require postulation of an afterlife. Additionally, he held that the sages had lived lives of happiness and virtue, eliminating the grounds used by Kant to postulate both immortality and God.

4. Political Philosophy: Fundamental Questions on Society and Government

Political philosophy is concerned with questions such as these: Prior to government and law, was humanity’s natural state one of freedom and equality, independence or sociality? Were humans inevitably in conflict or did they live in innocent bliss? Does government arise from a contract between persons, the recognized superiority of some persons to lead, or the decree of a higher power? Do we arrive at human laws by participatory exchange of views, do they derive from the nature of reality, are they codifications of the lives of exemplary persons, or are they decrees of rulers or a divine being? What is the best form of government? What is the purpose of government? Are there checks and balances on governments and rulers? Is revolt against the ruler or government ever justified? What is the proper balance between governmental authority and individual liberty of expression and thought? What is the role and responsibility of government to implement justice? In distributing goods, for example, are there rules of entitlement, fairness, equality of opportunity?

a. Confucius on Rulership and the Nature and Function of Politics

Rulership and governance is a principal theme of the Analects (Books 2, 11, 12). Indeed, several of Confucius’s disciples were apprenticed to him to learn the skills and wisdom necessary to become ministers and rulers. The most fundamental characterization of Confucius’s view of rulership is that he believed in a meritocracy. Rulers should ascend to power based on their merit, not their heredity or as a result of having won an election. Confucius’s meritocratic theory is not necessarily anti-democratic, but neither does it elevate democratic process above the higher value of the ruler’s character. Further, there is much in the Analects that suggests Confucius believed that common persons of his day were not prepared or able to participate in government.

Rulers should be exemplary persons, and those who possess virtue (de) will have no difficulty with their people or their kingdom. The classical Confucian ideal is expressed as “nei sheng wai wang” (“internally a sage, externally a ruler”). In an exchange with Ji Kangzi, Confucius says that the ruler who is an exemplary person can affect the entire kingdom with appropriateness (yi) and moral excellence (de), like the wind that blows over the grass (12.19).

Confucius recognizes the need for civil law to extend beyond the rites of propriety and morality (li). However, he also believes that leading the people by political measures and keeping them in place by civil law cannot ensure that the people will develop a sense of shame. Therefore, the measures and law are not sufficient to guarantee order. In contrast to such a style of rulership, a lord who can lead the people by means of his own virtuous power (de) will create a citizenry of honor and virtue. Most importantly, the lord’s governance will create trust (xin) between him and the people.

One of the classical “five relationships” of Confucianism is that between ruler and subject, which is understood as analogous to a family dynamic. The exemplary ruler will treat the people as though they are his children. Such filial conduct coming from the ruler will create among the people a sense of trust in the king and ministers. Confucius employs the concept of trust instead of “contract” or “agreement” concepts that have been the bedrock of Western political models since the 18th century.

Confucius did not believe that any given ruler had a “divine right” to be king. Rather, he held that a ruler who shows evidence of proper conduct (namely, self-cultivation and implementing corrections to real or potential harms to the people) would thereby earn the right to rule. As a result, the ruler would win the people’s respect and loyalty. “If proper in their own conduct, what difficulty would they have in governing? But if not able to be proper in their own conduct, how can they demand such conduct from others?” (Analects 13.13).

In contrast with minimal intrusion and maximal liberty that characterize Western civil libertarian models, a properly governed state is a value-laden one that produces an environment in which each person may achieve self-cultivation. When listing the tasks of government in order of importance, Confucius names cultivating the trust of the people first, then provision of food, and lastly security and defense (12.7). Western civil libertarian systems, for all their strengths and values, are not necessarily committed to the goal of creating an environment for self-cultivation. They may maximize liberty, but increased freedom does not equate to self-cultivation. For Confucius, politics is rectification or correction (zheng zhe, zheng ye) (12.17). The purpose of politics is to correct deficiencies or mistakes that impede the self-cultivation of each person. While taking a vote may resolve a policy question in participatory governments, it does not actually guarantee that the result is right and correct, which is one reason why Confucius looks to the exemplary leader rather than other models such as democracy or parliamentary debate.

While Confucius holds that filial respect should guide citizens’ conduct toward their rulers, he does not advocate blind obedience. Both Confucius and Mencius state that remonstrating with rulers is the responsibility of all who want a truly humane society.

b. Political Philosophy in the Mozi

The basic project of the Mohists was to establish a morally founded social order based on the will of Heaven. To do this, Mozi advocates a system of political hierarchy with the ruler at the top. Even so, the will of Heaven remains as the polity’s governing principle rather than the ruler’s will, his own personal gain or a democratic decision. According to Mozi, the function of this principle is to care for the people universally (jian ai) and benefit the people according to their needs.

The reason Mozi insists that government be structured in a hierarchy under Heaven’s instruction is his view of what Western philosophy has called “the state of nature,” a time before the existence of government. Mozi holds that in that state, there existed a multiplicity of moralities and values. Such pluralism and relativism did not lead to a social contract; rather, it turned people against each other. Only a true and absolute moral system given by Heaven can overcome such relativism and its resulting conflict.

Mozi states that the one who is the most worthy and understands the Way of Heaven (tiandao) is selected by Heaven and established as the Son of Heaven (Mozi 11.2). The worthy ruler is someone who has Heaven’s intention, just like wheelwrights have compasses and carpenters have squares. Wheelwrights and carpenters use their compasses and squares to evaluate circles and squares in the world, claiming that what conforms is right, what does not conform is wrong. The ruler governs by the standards of Heaven. The result is that unity of the people under a set of laws or principles does not come by mutual agreement but by the silencing of divergent points of view under a ruler who enacts the will of Heaven.

What is interesting about Mozi’s theory is that the common people should not only exhibit strict obedience to the rulers, but they should emulate them in their behavior. This is called “exalting worthiness,” and it represents Mozi’s view on the strength and purpose of a political community (Mozi 8.7). He argues for a strong version of political authoritarianism: a centralized state with a hierarchical, tightly organized bureaucracy. This structure, properly conceived, will lead to the benefit of the people.

c. Mencius’s Political Philosophy

Mencius’s political philosophy is often neglected in a study of Chinese thought. Unlike Mozi, who explored the origin of government and offered a kind of “state of nature” argument for its emergence, Mencius does not speculate about the circumstances that gave rise to government. Instead, he is interested in providing a robust constructivist philosophical ideology called “benevolent government” (renzheng).

Mencius was well-placed to write such a theory. He was a shi, or scholar who traveled from state to state, seeking to be a ruler’s political advisor or military strategist. These traveling advisors often had a significant influence on the ruler, and some of them even became powerful high-ranking officials.

For Mencius, a ruler who practices benevolent governance should do at least the following things: reduce punishment and taxation (1A5), rejoice with his people (1B1), make sure that the masses are neither cold nor hungry (1A7), take no pleasure in executions or war (1A6), let no one starve to death (1A4), and take care of the four types of people who are the most needy: widows, widowers, old people without children, and young children without fathers (1B5). In giving advice to King Xuan, Mencius makes clear that he is following Confucius in holding that the state should be ruled by the virtuous, not by those who are elected by the people or inherit rulership by family lineage.

Mencius thinks it is the obligation of government to ensure that the basic needs of the people are met. Today this would be called the provision of social goods or secondary goods, in contrast to the primary goods of liberty and freedom. He is not intent on teaching that the role of government is to maximize civil liberty. He provides specific advice about how the state should help secure the livelihood of the people, including recommendations about everything from tax rates, to farm management, to the pay scale for government employees (3A3). Mencius also agrees with Confucius that self-cultivation is crucial both for the individual and for society. So he advocates an educational system in the ideal state that would instruct people how to be responsible in their relationships as parent, child, ruler, minister, spouse, and friend (3A4).

According to the Mengzi text, Mencius touches upon the removal of the ruler on several occasions. He says that ministers, but not the common people, should not hesitate to depose a ruler who repeatedly refuses to listen to admonitions against serious mistakes (5B9). Speaking of historical instances in which rulers were removed, Mencius says that a sovereign who mutilates benevolence (ren) or cripples rightness (yi) is an outcast, even if he is an emperor (1B8). If the king is not humane, and if he abuses the people instead of taking care of their welfare, he can be legitimately deposed.

d. Lao-Zhuang and Yellow Emperor Traditions on Rulership and Government

The logia of the Daodejing make it clear that reality left alone moves as it should, and that it is human tampering with relationships and attempts to guide and orchestrate things that make a mess of life. Morality and law are evidences of such tampering. “Only when the great Way (Dao) is abandoned, do morals and laws of benevolence and righteousness arise” (DDJ 18). “The more taboos and prohibitions there are in the world, the poorer the people… The more laws and edicts, the more there will be thieves and robbers” (DDJ 57). Ideally, the follower of the dao will not engage in rulership or political machination at all because there will be no need to do so. In fact, human efforts to manage life through law, morality, and governmental policy only make matters worse (DDJ 29). In this connection, the text is famous for its aphorism that ruling a state is like cooking a small fish (DDJ 60), the point of which is that the least amount of tampering is best: the ruler should allow the Dao to take its course without manipulation by government.

There is no question that the Lao-Zhuang philosophical tradition wanted to reduce governmental control. “The more dull and passive the government, the more honest and agreeable the people. The more active and [interventionist] the government, the more deformed and deficient the people’s lives will be” (DDJ 58). Rulers should remain well in the background and not seek a name for themselves. “The greatest of rulers is but a shadowy presence; next is the ruler who is loved and praised; next is the one who is feared; next is the one who is reviled” (DDJ 17).

Chapters 1-7 of the Zhuangzi tend to continue these same emphases from the Daodejing, calling governing by law and structure “a bogus virtue,” as futile as drilling into a river (ZZ Ch. 7). These chapters make it clear that Daoist masters did not seek, and even actively avoided, positions as officials or rulers. Chapter 28 contains a long series of text logia all dealing with rulership, designed to show that when they were approached with the offer of political employment, famous Daoist masters refused it, fled into far regions, or even attempted suicide.

However, in the Zhuangzi’s Yellow Emperor-Laozi Daoist (Huang-lao) lineage materials (ZZ Ch. 11-16; 18-19) a different view of rulership is expressed. These sections do not recommend turning away from political involvement. Instead, they say that in the early period of his rule the Yellow Emperor used the Confucian virtues of benevolence (ren) and righteousness (yi) to meddle with the minds of men. What followed was a history of consternation and confusion, all the way down to the Confucians and Mohists who are mentioned by name. However, the Yellow Emperor then underwent a transformation and learned the “Perfect Dao” on the Mountain of Emptiness and Identity (Kongtong). When he returned to rule and followed wuwei, his kingdom became peaceful, and he became an immortal transcendent (xian) (ZZ Ch. 11). The views of the Yellow Emperor-Laozi tradition are more developed in the important work Masters of Huainan (Huainanzi), edited under Liu An and presented in 139 B.C.E. to the Han imperial court.

e. Legalism and Hanfei (280?-230? B.C.E.)

The term “Legalism” or the “Legalist School” (fa jia) first appeared in Sima Qian’s Records of the Historian circa 90 B.C.E. Traditionally, Guan Zhong (d. 645 B.C.E.?) is called “the father of Legalism”. However, calling “Legalism” a school is somewhat misleading because there was no “school” of this thought per se; it was more of a philosophy of law and its practice. A number of philosophers associated with this approach were active in government as ministers, officials, and imperial consultants. For example, Shang Yang (d. 338 B.C.E.) was a chancellor of the Qin state and Shen Buhai (d. 337 B.C.E.) held a similar position in the Han state. Hanfei (280?-230 B.C.E.) was an advisor in the Han state, just prior to its annexation by Qin in creating China’s first empire in 221 B.C.E. It is generally acknowledged that Qinshihuang (birth name, Ying Zheng, 260-210 B.C.E.), the first emperor of China, as well as advisors such as Li Si (280?-208 B.C.E.) followed Legalist writings in unifying the diverse states of China into an empire. Possibly, they followed a version of the text called Hanfeizi.

Hanfei shared a view of human nature somewhat similar to that of Xunzi. He thought the natural aspirations of the people are such that they all move toward security and benefit. He held that public-spirited people are few while private-minded individuals are numerous. While this is not a well-developed theory of the “state of nature,” it was adequate for Hanfei to pose the problem of where the state comes from and why it is necessary. His recommended solution for “private-mindedness” is the establishment of government.

Still, Hanfei does not mean that human nature is evil. He simply means that humans give primacy to their own self-interest. The carriage maker hopes that men will grow rich and eminent so that he can sell carriages. The coffin maker wants persons to continue to pass away, so that he can stay in business, but not because he is evil or wishing others bad fortune.

Hanfei has a deep appreciation for the power of socio-economic forces on the life of humans and any society they create. He is not a complete economic determinist, but he feels that resources and scarcity play a role in the extent to which one will adhere to social order. In taking this position, Hanfei anchors his political theory on the belief that human action is shaped by the socio-economic environment in which persons live. So, creating a state in which the resources are sufficient, available to all, and fairly distributed is an important way to encourage moral goodness, peace, and societal harmony.

This means that if a ruler wants and needs his people to work diligently, he must motivate them by an appeal to their self-interests. Moreover, the skillful ruler should set up policies and administer the state so that an individual’s maximization of his own self-interest will also enlarge the public interest and the state.

Unlike the Confucian, Mohist, Mencian, and Lao-Zhuang traditions, Hanfei’s ideal state does not depend on having a virtuous ruler. Even a ruler who is morally deficient in his own personal life may, nevertheless, be a good ruler if he sets up the proper policies and administration by means of five tactics: the use of the power of position; the employment of administrative methods; the making of laws; taking hold of the two handles of government (reward and punishment); and the non-action (wuwei) of the ruler.

Hanfei’s separation of politics from morality is an approach that earlier Chinese philosophers would not have accepted. To put it succinctly, while previous classical Chinese political philosophies insisted on rule by the virtuous (for example, a meritocracy) and a close association between morality and politics, Hanfei sees no difficulty in considering both the ruler and politics as amoral.

f. Political Thought in the Han Dynasty (206 B.C.E.-220 C.E.)

i. Dong Zhongshu (179-104 B.C.E.)

When Han Emperor Wu took control of the state, he consulted scholars and officials to gain advice on how to govern. Dong Zhongshu wrote several documents designed to reform government that used Confucian ideology paired with an ontology of the resonance (ganying) between human action and Heaven’s activity; that is, morally good action will result in Heaven’s blessing in health, position, and longevity. Dong recommended the establishment of a Grand Academy (taixue) to train those who would serve the government in the skills they would need.

His most important works are contained in the Luxuriant Gems of the Spring and Autumn Annals. Dong continued the emphases of Confucius and Mencius calling for rule by the meritorious and for the establishment of a humane (ren) government. A principal difference between Dong and Confucius and Mencius is that he attached more significance to the role of Heaven in validating policy and social structure as a transcendent power. The ruler and his ministers were subject to the authority of Heaven, and their task was to do good and proper acts, setting up the kind of resulting resonance that would be seen in Heaven’s blessings. Violation of the principles of Heaven would bring disturbances in the natural, human, and spiritual worlds.

Dong built his philosophy on a much heavier reliance on the transcendent than can be seen in Confucius, Mencius, or Xunzi. Rulers must follow the principles of Heaven and fulfill its mandate, or else disaster will follow. He even speaks of Heaven as the “great grandfather” (zeng zufu) of humanity. And yet, following Confucius, Dong insisted that in order to carry out the will of Heaven, a ruler must rely on education and the rites rather than punishment and killing.

Applying the explanatory system of the five elemental phases, Dong wrote that rulers should practice, and the state should inculcate, the five virtues: humaneness (ren), rightness (yi), propriety (li), wisdom (zhi), and trustworthiness (xin). Dong believed strongly that all political activity should reflect the five phases. To be in accordance with these phases, he even called for a new calendar to be issued, colors of banners to be changed, monuments redesigned, and complete revision of other trappings of government.

ii. Masters of Huainan (Huainanzi 139 B.C.E.)

According to his biography in the Book of the Early Han (Hanshu, 44.2145) Liu An, the king of Huainan (in modern Anhui province) and uncle of Han Emperor Wu, gathered a large number of scholars and practitioners of esoteric techniques to Huainan in the period 160-140 B.C.E., and supported them in the creation of written works synthesizing their views. The Masters of Huainan (Huainanzi) was a product of this interchange of ideas.

It is a work focused on educating a ruler on the tasks before him. In the text we find a theory of the fall of humanity from an original harmony in the state of nature to human government and politics with its attendant disorder and violence. Instead of government resulting from agreement between persons for whom there is no law, where the powerful can enforce their will over the weak, the text takes the reverse approach. The primal state is presented as a natural, spontaneous, and peaceful existence. Government is then the source of humanity’s problems, not a solution.

Chapter nine of the Masters of Huainan (HZ) is entitled “The Ruler’s Techniques,” and its focus is on the methods that a ruler should use to create a humane and orderly government. The first, and certainly the most important, technique for a ruler is to act in wuwei. This does not mean the ruler should do absolutely nothing. It means that when he acts, nothing comes from him personally (HZ 9.23); that is, his policies are neither biased by his private preferences (HZ 9.25), nor are they restricted by the limits of his own vision for the state (HZ 9.9-9.11). Instead, his actions implement the movement of the Way (Dao) of Heaven (HZ 9.2).

The best form of government, the text suggests, is one where the ruler devotes himself to wuwei. Ideal rulers of the distant past such as Fuxi, Nuwa, and the Yellow Emperor (Huangdi) are described as being its adherents (HZ 6.7). Because they followed their spontaneous natures and aligned themselves with profound wuwei, the world naturally became harmonious (HZ 8.5).

g. Zhu Xi on Law as the Enforcement of Morals

Zhu Xi shared Confucius’s distrust for the ability of law alone to bring order in society and to cultivate the people. He recognized that government and law were necessary, but considered them insufficient to bring about social order; virtue and ritual were still important. Virtue, law, rites, and punishments should complement each other. Zhu made an important contribution to Chinese political theory by insisting that governing by virtue and ritual was compatible with a system of laws and punishments, and he argued that Confucius’s protest against reliance on law was motivated by the context in which he lived, when some rulers made no use of virtue or the rites (li) at all.

In fact, Zhu Xi supported the use of law to assist in the moral education of the populace. The purpose of law was not merely to protect those in the society from harm or injury. It was also to shape the character of the society and its people. Accordingly, government not only had the right but also the obligation to engineer the morality of society and control what the people could do morally.

Nevertheless, Zhu Xi was aware of the long history of abuse of the power to make law, grant amnesties, and remit punishment practiced by Song dynastic rulers. He argued that laws must be clear and the enforcement of them must be just. He challenged directly the practice of amnesty (dashe) as frequently degenerating into a form of favoritism and injustice. Because he insisted on the enforcement of law and the punishment of offenders, Zhu is often misunderstood as being akin to the worst abusers of law found in the Legalist tradition. However, he was not advocating severity of punishment as a value in itself, but rather recommending the just administration of law as the active enforcement of morals, using politics as a means of moral cultivation.

h. Yan Fu (1854-1921): China Not Ready for Democracy

After the first Sino-Japanese War of 1894-1895, China entered into the period that one might call Modern Chinese Philosophy, in which there was an influx of texts and ideas from the Western world. Yan Fu became the most influential translator of Western works in China. In fact, Yan was not only the greatest authority on Western philosophy in China at the beginning of the 20th century, but he was also the first scholar to systematically introduce Western philosophy by translating a significant number of works: Thomas Henry Huxley’s Evolution and Ethics (1893), published in Chinese in 1898; Adam Smith’s Wealth of Nations (1776), published in Chinese in 1902; Herbert Spencer’s The Study of Sociology (1872) and John Stuart Mill’s On Liberty (1859), both translations published in Chinese in 1904; Charles de Secondat, Baron de Montesquieu’s The Spirit of the Laws (1748); John Stuart Mill’s A System of Logic (1843), translated in 1905; and William Stanley Jevons’s The Theory of Political Economy (1878), translated in 1909.

Yan was a true cultural intermediary who, at a critical moment in history, sought to make European works of philosophy and social science accessible to a Chinese readership. He put forward a form of Social Darwinism according to which social organization is also a product of evolution and subject to the same laws and processes.

He declared that both the Western powers and Japan, which had invaded and exploited China, were nevertheless morally and intellectually “superior.” In his view, China had become “inferior” as a result of its inability to excel in the international competition of worldviews, technology, and socio-political structures. He made it clear that in order for China to fare well in global competition with other nations it must alter its societal structure.

Yan claimed that the reason why China was weaker and less able to compete than the Western nations was its lack of liberty for its people. Yan not only accepted Mill’s view in On Liberty that the strength of a body politic lies in its commitment to the discovery of truth but he also recognized that liberty of expression and inquiry is essential to the pursuit and recognition of truth. Accordingly, he extended the point to claim that liberty is essential in order to produce a strong nation. When people lack liberty, they will not be motivated to fight for the state or work hard in order to create a productive society.

For Yan Fu, the concept of liberty that he was drawing from Mill does not mean doing whatever one wants. Society has genuine interests that might be harmed by indiscriminate freedom of action. Moreover, society has a right to transmit a set of values and cultural practices that can limit freedom of the individual. But Yan cautions that China’s highly structured moral beliefs and social rituals can overwhelm liberty if not properly watched.

To this point, there had not been any rigorous analysis of the nature and place of liberty in Chinese political philosophy. While some philosophical defenses of remonstrance with parents or rulers can be found in the history of Chinese philosophy, the function of this concept is much different than what Mill means by one’s individual free expression of lifestyle. Accordingly, when Chinese intellectuals began reading Mill and Yan’s commentaries on the translation, a new way of looking at society and a person’s place within it came into view.

Yan was forced to defend himself against conservative critics in China who felt the radicalism of a civil libertarian society represented danger and the possibility of chaos. His strategy was to claim that although society should not interfere with individual human liberty, neither should the individual do anything to harm society by his free expression. Accordingly, Yan’s translation of Mill’s work was published under the title On the Borderline between Society’s and Individuals’ Power. In his comments on the book, Yan extended Mill’s “harm principle” to include a legitimate political power to restrict freedom in the name of the protection of societal and communal integrity and value.

Nevertheless, Yan did not support China’s 1911 revolution to create a Republic and disestablish the Qing dynasty. Rather, he insisted on gradual political reform. He thought that improved education for the Chinese population was needed before the people would be ready to participate in government; the Chinese people at the turn of the 20th century, Yan believed, were not yet ready for participatory government and responsible use of free expression.

i. Liang Qichao (1873-1929): Emergent Chinese Nationalism

Xiao Yang (17) calls Liang Qichao “the most widely read public intellectual during the transitional period from the late Qing dynasty (1644-1912) to the Republican era (1912-49)”. For Liang Qichao, the central task of philosophy is to perfect the principles and rules necessary for social affairs within a political system. He thought an authentic philosopher was not so much an ontologist or epistemologist as a jingshi; that is, a statesman or scholar who practices statesmanship.

Liang built his early political philosophy, from 1896 to 1903, on the position that the myriad things of existence move continuously toward integration and grouping (qun). He read Thomas Huxley’s Evolution and Ethics, and he interpreted Huxley’s findings to mean that higher evolutionary development always took place when solidarity and group harmony became overriding intentions, whether in kinship lines, groups, tribes, or emergent human societies. This position led him to distinguish between the moral virtues that related to individual personal conduct (side) and civic or public virtues (gongde), which were necessary for the creation of a healthy and ideal society. Liang proposed to develop a modern Chinese political philosophy designed to produce what he called a “new citizenry” (xinmin) for China.

Liang took the Chinese term min (people), which was used to mark the people that made up a population, and replaced it with the concept guomin (citizens) in an intentional effort to tie individual identity and nationalism together. He believed a philosophically viable political body is not merely made up of a population. The people must be brought into being as citizens who express their powers and right to self-government; otherwise, the nation itself ceases to exist and becomes something ultimately destructive to human flourishing.

In his essay, “On the Progress China has made in the Last Fifty Years” (1922), Liang held to two principles. The first, which he called “the spirit of nation building,” was “Anyone who is not Chinese has no right to govern Chinese affairs” (Liang 7). The second, known as “the spirit of democracy,” was “Anyone who is Chinese has the right to govern Chinese affairs” (Liang 4031).

j. Mao Zedong (1893-1976): The Sinification of Marxism

Marxist writings were introduced to China as part of the movement called “Western Learning” (Xi xue). The first reference to Western socialism seems to be in an essay by Yan Fu. However, Zhao Bizhen’s translation of Fukui Junzo’s Modern Socialism in 1903 was the first comprehensive introduction of Marxism into China. In 1912, a Chinese translation of Friedrich Engels’s Socialism: Utopian and Scientific was serialized in issues 1-7 of “The New World” (Xin Shijie). The Chinese version of Karl Marx and Friedrich Engels’s Communist Manifesto (Gongchandang xuanyan) was translated by Chen Wangdao and published in 1920. In 1931, Chen Qixiu translated Das Kapital, the fundamental text of Marxist economics.

While many Chinese intellectuals wrote on Marxism in the early part of the 20th century, no thinker is as important to the sinification of Marxism as Mao Zedong. Although some would question Mao’s credentials as a philosopher, he did educate himself extensively on Chinese history and philosophy. His concerns were directed toward a relatively narrow range of philosophical inquiry: social, political, and economic thought.

Mao thought that Marxism must be made to engage with the specific and particular situation of the Chinese people and culture. He held that Chinese Communists must learn how to apply the theories of Marxism-Leninism to concrete situations in China, enabling an application of Marxist philosophy that is distinctively Chinese.

Several factors are important to note about how and why Marxism assumed its particular form in China in the 1940s-1970s. Perhaps most important of these is that Chinese Marxism drew on the Chinese intellectual tradition in ways that minimized some of the difficulties that are found in Western Marxism. Long before the introduction of Marxist thought, Chinese philosophical history embraced the significance of socio-economic conditions for communal order, a humanistic non-religious worldview, dialectical social and intellectual processes, and authoritarian rule by an enlightened elite. When Marxism was rendered into Chinese, its central concept of “dialectics” was translated as tongbian or “continuity through change,” a concept used historically in Chinese tradition.

Instead of using yin and yang to translate Marxist “dialectics,” Mao uses the terms mao (spear) and dun (shield), employing a famous story taken from the Hanfeizi. In the story a dialectical tension emerges when a man offers to sell both an invincible spear and an impenetrable shield. Mao uses this example to highlight the inevitability of the dynamic interaction of divergent views that contradict each other. For Mao, only actual political practice and societal change, not intellectual cognition or language, can fully overcome the tongbian (dialectics) of Chinese social and economic realities. Dialectics is not an academic exercise, but a revolutionary one.

In what became known as “Mao Zedong Thought,” Mao called for a “New Democracy” where power would be taken from the organizations and persons that had perpetuated China as a semi-colonial and semi-feudal society and given to a new revolutionary leadership of the people who would transform China into a new socialist state. He spoke of this change as a dictatorship of the revolutionary leadership and a democratic centralism. By “democratic” he did not mean the product of universal suffrage but “oriented toward the people” (Mao, June 30, 1949 and February 27, 1957).

The “New Democracy” would require a “New Economy” in which the new government would own the banks and industrial and commercial enterprises in order to prevent them from dominating the livelihood of the people (Mao February 27, 1957). Eventually, these political ideas found expression in the “iron rice bowl” (tie fanwan) and state-owned enterprises, as a vision of the ways government should pursue distributive justice.

k. Forms of Contemporary Confucian Political Theory

i. Tu Weiming

One thinker who is contributing to a renewal and revision of political theory by constructing a New Confucian social theory is Tu Weiming. In his Reinventing Confucianism: the New Confucian Movement, Umberto Bresciani names Tu as the leader of the “third generation” of New Confucians. Tu is Professor of Philosophy and founding Dean of the Institute for Advanced Humanistic Studies at Peking University.

Tu considers Confucianism to be an all-embracing humanism that merges the secular and sacred. He also believes that the Confucian moral ideal of the exemplary person can be realized more fully in a liberal democratic society than in either the traditional imperial monarchies or modern authoritarian regimes. Moreover, he argues that Confucianism adapted for the contemporary period is an antidote to the deficiencies of Western philosophy, which gives insufficient importance to the idea of community and privileges political ideals that tend to degenerate into injustice and disorder.

Tu closely associates ethics and politics, and argues that the work of political rectification envisioned by Confucius is one which monitors and constantly adjusts social processes of communal life in order to bring about a “fiduciary community” where each person is not merely permitted, but encouraged, to pursue moral self-cultivation.

Tu suggests that in the Confucian community divergent interests and plural desires are dealt with differently than in social contract and civil libertarian adversarial systems where the tyranny of the majority may be expressed in the ballot. In the fiduciary community, no decision by ruling authority can be regarded as appropriate if it destroys the ethos of trustworthiness among the people or between the people and the government. Such a delimitation of power creates in the community what Tu calls a “convergence of orientations”. The resulting benefit is a fiduciary community where citizens “exhort one another to do good” (bai xing quan) in a learning culture.

While he recognizes the immense value of Western enlightenment rationality, Tu insists that its tools and values must be supplemented by three requirements that can move humanity toward a global ethic: 1) converting from an anthropocentric to an anthropocosmic vision that appreciates the vibrancy of spirituality and removes man from being the measure of all things; 2) revising instrumental rational empiricism to include sympathy and empathy necessary for a full phenomenology of experience; and 3) instantiating the universalizable values of liberty, rationality, law, human rights, and the dignity of the individual.

ii. Jiang Qing

Jiang Qing is a contemporary Chinese Confucian who is best known for his criticism of New Confucianism as expressed by Tu Weiming and others. According to him, New Confucianism deviated from original Confucian principles and is overly influenced by Western liberal democracy. He also feels there is a drift in the Chinese Communist Party that seems unfocused and without direction. He proposes an alternative path for China called “Constitutional Confucianism” or “Political Confucianism.” Jiang believes that China’s ongoing political and social problems are to be solved by the revival of and commitment to what he considers to be authentic Confucianism.

To implement his changes, Jiang argues that Confucian materials should replace the Marxist curriculum taught in China’s universities and government party schools. He has been an advocate for new Confucian academies throughout the country, especially his own retreat called Yangming Spiritual House. He has been a central player in the “Reading of the Classics Movement” (dujing yundong), having edited a 12-volume school textbook titled The Fundamental Texts of Chinese Culture Classics for Reading (2004).

In A Confucian Constitutional Order, Jiang advocates what he calls “Humane Authority” as the guiding value of political process. His new model is expressed through a trilateral parliamentary framework made up of a House of Exemplary Persons that represents the sacred, a House of the Nation that represents historical and cultural legitimacy, and a House of the People that represents popular sentiment.

iii. Kang Xiaoguang

Kang Xiaoguang has taken up the challenge to offer a political philosophy for China’s post-Mao years in several works. A summary of his views in English is provided by David Ownby (2009). Kang calls for the Chinese Communist Party to be Confucianized. He thinks Marxism should be replaced with a reconstituted and adapted philosophical system of Confucius and Mencius. He holds that while the educational system will keep the party schools, their syllabi should be changed, listing the Four Books and Five Classics as required texts. Kang desires a return to the examination system for all promotions in the Chinese bureaucracy and he wants Confucian philosophical teachings to be added to each examination. Moreover, he maintains that Chinese society as a whole should be Confucianized. Here the key is to introduce Confucianism into the national education system, adding courses in Chinese culture that Kang claims will impart a value system, a faith, and a soul for the culture. In the long term, he thinks that the moral bearings of China can be rebalanced only if Confucianism becomes the state’s civil value system.

iv. Fan Ruiping

Fan Ruiping’s project is set out most clearly in his Reconstructionist Confucianism: Rethinking Morality after the West (2010). In this work, he calls for reclaiming and articulating resources from the Confucian tradition to address contemporary moral and public policy challenges. He sets his effort against both Western civil libertarian democracies and the New Confucianism of Tu Weiming and others. He holds that while Western social philosophy is founded on abstract and general principles, Confucianism is defined by specific rules that identify particular practices leading to a virtuous mode of life developed in the forge of a properly harmonious Confucian family. Fan argues that in such families persons learn how to treat others as unequals and gain mastery of the push and pull of graded love, creating a virtuous familism that is transferable to the society at large. Instead of Western language about rights, Fan holds that the goal in political policy is to treat persons as relatives and the nation and global community as a household drawing on the archetype of a traditional Chinese family that brought many persons into its circle of influence. Rather than norms such as “justice as fairness” (John Rawls), Fan characterizes the Confucian model as “justice as harmony.”

An important source in English for current debates about Confucian reconceptualization of Chinese politics in theory and practice is Fan’s collection of essays entitled The Renaissance of Confucianism in Contemporary China.

5. Epilogue

The history of Chinese philosophy may be approached in many ways. In this article, an overview of many important thinkers has been provided, and their contributions to world philosophy on the topics of ontology, epistemology, moral theory, and political philosophy were discussed. Another viable approach to the history of the tradition would be to demarcate the moves made in Chinese philosophical thought within the flow of Chinese history itself, giving attention also to the interactions between Chinese thinkers and internal dialogues of significance. Both of these approaches can contribute to an appreciation of the significance and value of philosophy and the important place Chinese philosophers occupy within it.

6. References and Further Reading

Ames, Roger. The Art of Rulership: A Study of Ancient Chinese Political Thought. Albany: SUNY Press, 1994.
Angle, Stephen. Contemporary Confucian Political Philosophy: Toward Progressive Confucianism. Malden, MA: Polity Press, 2012.
Bell, Daniel, and Fan Ruiping, eds. A Confucian Constitutional Order: How China’s Ancient Past Can Shape Its Political Future. Princeton: Princeton UP, 2012.
Bernal, Martin. Chinese Socialism to 1907. Ithaca, NY: Cornell UP, 1976.
Bresciani, Umberto. Reinventing Confucianism: The New Confucian Movement. Taipei: Taipei Ricci Institute for Chinese Studies, 2001.
Bishop, Donald, ed. Chinese Thought: An Introduction. Delhi: Motilal Banarsidass, 1985.
Blakeley, Donald. “The Lure of the Transcendent in Zhu Xi.” History of Philosophy Quarterly 21.3 (2004): 223-40.
Briere, O. Fifty Years of Chinese Philosophy: 1898-1950. Trans. Lawrence Thompson. New York: Praeger, 1965.
Bruce, Percy. Chu Hsi and His Masters: An Introduction to Chu Hsi and the Sung School of Chinese Philosophy. London: Probsthain & Co., 1923.
Carr, Karen, and Philip J. Ivanhoe. The Sense of Antirationalism: The Religious Thought of Zhuangzi and Kierkegaard. New York: Seven Bridges, 2000.
Chan, Wing-tsit, trans. A Sourcebook in Chinese Philosophy. Princeton: Princeton UP, 1963.
Chan, Wing-tsit, trans. Instructions for Practical Living and Other Neo-Confucian Writings by Wang Yang-Ming. New York: Columbia UP, 1963.
Chang, Carsun. The Development of Neo-Confucian Thought. 2 vols. New York: Bookman Associates, 1957-1962.
Chin, Ann-ping, and Mansfield Freeman. Tai Cheng on Mencius: Explorations in Words and Meaning. New Haven: Yale UP, 1990.
Ching, Julia. To Acquire Wisdom: The Way of Wang Yang-ming. New York: Columbia UP, 1976.
Chou, Chih-P’ing, ed. Collection of Hu Shih’s English Writings. 3 vols. Heidelberg: Foreign Language Teaching and Research Publishing Co., 1995.
Cheng, Chung-ying, and Nicholas Bunnin, eds. Contemporary Chinese Philosophy. Malden, MA: Blackwell, 2002.
Cua, Antonio. The Unity of Knowledge and Action: A Study in Wang Yang-ming’s Moral Psychology. Honolulu: U Hawaii Press, 1982.
Cua, Antonio, ed. Encyclopedia of Chinese Philosophy. New York: Routledge, 2003.
De Bary, Theodore, Irene Bloom, and Joseph Adler, eds. Sources of Chinese Tradition: From Earliest Times to 1600. 2nd ed. Vol. 1. New York: Columbia UP, 2000.
De Bary, Theodore. The Unfolding of Neo-Confucianism. New York: Columbia UP, 1975.
Fan, Ruiping. Reconstructionist Confucianism: Rethinking Morality after the West. Heidelberg: Springer, 2010.
Fan, Ruiping, ed. The Renaissance of Confucianism in Contemporary China. Heidelberg: Springer, 2011.
Feng, Youlan. A Short History of Chinese Philosophy. Trans. Derk Bodde. New York: Free Press, 1948.
Feng, Youlan. A History of Chinese Philosophy, 2 vols. Trans. Derk Bodde. Princeton: Princeton UP, 1952-1953.
Forke, Alfred, trans. Philosophical Essays of Wang Ch’ung. London: Luzac, 1907. Complete text available at Hathi Trust Digital Library.
Fraser, Chris. “Knowledge and Error in Early Chinese Thought.” Dao: A Journal of Comparative Philosophy 10.2 (2011): 127–48.
Fraser, Chris. “The Mohist Conception of Reality.” Chinese Metaphysics and its Problems. Eds. Chenyang Li and Franklin Perkins. Cambridge: Cambridge UP, 2014.
Gardner, Daniel. Learning to be a Sage: Selections from the Conversations of Master Chu, Arranged Topically. Berkeley: U California Press, 1990.
Graham, A.C. Studies in Chinese Philosophy and Philosophical Literature. Singapore: Institute of East Asian Philosophies, 1986.
Graham, A.C. Yin-Yang and the Nature of Correlative Thinking. IEAP Occasional Paper and Monograph Series, No. 6. Singapore: Institute of East Asian Philosophies, 1986.
Graham, A.C. Disputers of the Tao: Philosophical Argument in Ancient China. LaSalle, IL: Open Court, 1989.
Hsiao, Kung-chuan. A History of Chinese Political Thought, vol. 1. Trans. F.W. Mote. Princeton: Princeton UP, 1979.
Hsu, Sung-Peng. “Hu Shih.” Chinese Thought: An Introduction. Ed. Donald Bishop. Delhi: Motilal Banarsidass Publishers, 1995: 364-91.
Hu Shi. The Development of the Logical Method in Ancient China. Shanghai: The Oriental Book Co., 1928.
Hu Shi. “My Credo and Its Evolution.” Living Philosophies: A Series of Intimate Credos. New York: Simon and Schuster, 1931: 235-63.
Hu, Xinhe. “Hu Shi’s Enlightenment Philosophy.” Contemporary Chinese Philosophy. Eds. Chung-ying Cheng and Nicholas Bunnin. Oxford: Blackwell Publishing, 2002: 82-102.
Hutton, Eric, trans. and ed. Xunzi: The Complete Text. Princeton: Princeton UP, 2014.
Ivanhoe, Philip J., and Bryan W. Van Norden, eds. Readings in Classical Chinese Philosophy. 2nd ed. Indianapolis: Hackett Publishing, 2006.
Jiang, Qing. A Confucian Constitutional Order: How China’s Ancient Past Can Shape Its Political Future. Princeton: Princeton UP, 2013.
Jiang, Xinyan. “Enlightenment Movement.” History of Chinese Philosophy. Ed. Bo Mou. New York: Routledge, 2009: 473-511.
Johnston, Ian, trans. The Mozi: A Complete Translation. New York: Columbia UP, 2010.
Knight, Nick. Rethinking Mao: Explorations in Mao Zedong’s Thought. Lanham, MD: Lexington Books, 2007.
Lai, Karyn. An Introduction to Chinese Philosophy. Cambridge: Cambridge UP, 2008.
Lau, D.C., trans. Mencius. New York: Penguin Books, 2003. A revision of Lau’s landmark 1970 edition.
Liang, Qichao. Liang Qichao Quanji (The Collected Works of Liang Qichao). 10 vols. Beijing: Beijing Publishing House, 1999.
Liao, W.K., trans. Complete Works of Hanfeizi. London: Arthur Probsthain, 1939.
Littlejohn, Ronnie. An Introduction to Chinese Philosophy. London: I.B. Tauris, 2015.
Liu, JeeLoo. An Introduction to Chinese Philosophy: From Ancient Philosophy to Chinese Buddhism. Malden, MA: Blackwell, 2006.
Lowe, Scott. Mo Tzu’s Religious Blueprint for a Chinese Utopia: the Will and the Way. Lewiston, NY: The Edwin Mellen Press, 1992.
Machle, Edward. Nature and Heaven in the Xunzi. Albany: SUNY Press, 1993.
Major, John. Heaven and Earth in Early Han Thought. Albany: SUNY Press, 1993.
Major, John, Sarah Queen, Andrew Meyer, and Harold Roth, trans. The Huainanzi: A Guide to the Theory and Practice of Government in Early Han China. New York: Columbia UP, 2010.
Makeham, John, ed. New Confucianism: A Critical Examination. New York: Palgrave Macmillan, 2003.
Makeham, John. Dao Companion to Neo-Confucian Philosophy. New York: Springer, 2010.
Mao, Zedong (1917-45). Collected Works of Mao Zedong. U.S. Government’s Joint Publications Research Service.
Mei, Yi-Pao. “The Kung-sun Lung Tzu with a Translation into English.” Harvard Journal of Asiatic Studies 16 (1953): 404-37.
Mou, Bo, ed. Comparative Approaches to Chinese Philosophy. Aldershot, UK: Ashgate, 2003.
Mou, Bo, ed. A History of Chinese Philosophy. New York: Routledge, 2009.
Mou, Zongsan. “Metaphysical Mind and Metaphysical Nature (Xinti yu xingti).” Mou Zongsan’s Complete Works. Vol. 5. Taipei: Lianjing, 1968.
Nylan, Michael. “Wang Chong.” Encyclopedia of Chinese Philosophy. Ed. Antonio S. Cua. New York: Routledge, 2003: 745-48.
Ownby, David. “Kang Xiaoguang: Social Science, Civil Society, and Confucian Religion.” China Perspectives 80 (2009): 101-111.
Peerenboom, Randall P. Law and Morality in Ancient China: The Silk Manuscripts of Huang-Lao. Albany: SUNY Press, 1993.
Robins, Dan. “Xunzi.” The Stanford Encyclopedia of Philosophy. Ed. Edward Zalta. 2007.
Rochat de la Vallee, Elisabeth. Wuxing: The Five Elements in Classical Chinese Texts. London: Monkey Press, 2009.
Rutt, Richard, trans. Zhouyi: The Book of Changes. New York: RoutledgeCurzon, 1996.
Shaughnessy, Edward, trans. The I Ching: The Classic of Changes. New York: Ballantine Books, 1997.
Shen, Vincent, ed. Dao Companion to Classical Confucian Philosophy. New York: Springer, 2013.
Slingerland, Edward. Effortless Action: Wuwei as Conceptual Metaphor and Spiritual Ideal in Early China. New York: Oxford UP, 2003.
Tan, Chester C. Chinese Political Thought in the Twentieth Century. Garden City, NY: Anchor Books, 1971.
Tsukamoto, Zenryu. History of Early Chinese Buddhism: From Its Introduction to the Death of Hui-Yuan. Trans. Leon Hurvitz. Tokyo: Kodansha International, 1985.
Watson, Burton, trans. Hsun Tzu: Basic Writings. New York: Columbia UP, 1963.
Watson, Burton, trans. Han Fei Tzu: Basic Writings. New York: Columbia UP, 1964.
Watson, Burton, trans. The Complete Works of Chuang-Tzu. New York: Columbia UP, 1968.
Watson, Burton, trans. The Essential Lotus: Selections from the Lotus Sutra. New York: Columbia UP, 2002.
Wilson, Thomas. Genealogy of the Way: The Construction and Uses of the Confucian Tradition in Late Imperial China. Palo Alto, CA: Stanford UP, 1995.
Xiao, Yang. “Liang Qichao’s Political and Social Philosophy.” Contemporary Chinese Philosophy. Eds. Chung-ying Cheng and Nicholas Bunnin. Oxford: Blackwell Publishing, 2002: 17-36.
Zhang, Dainian. Key Concepts in Chinese Philosophy. New Haven: Yale UP, 2002.
Zhang, Dongsun. “A Chinese Philosopher’s Theory of Knowledge,” The Yenching Journal of Social Studies, 1.2 (1939), reprinted in S.I. Hayakawa (ed.), Our Language and Our World: Selections from Etc.: A Review of General Semantics. New York: Harper, 1959. 299-323.
Zhou, Xiaoliang. “Woguo xifang zhexue yanjiu de huigu xianzhuang shuping he zhanwang [The Studies of Western Philosophy in China: Historical Review, Present States and Prospects].” Paper presented at the Chinese Academy of Social Sciences, (July 2007).
Zürcher, Erik and Stephen F. Teiser. The Buddhist Conquest of China: The Spread and Adaptation of Buddhism in Early Medieval China. Leiden: E.J. Brill, 1972.

Author Information

Ronnie Littlejohn
Email: ronnie.littlejohn@belmont.edu
Belmont University
U. S. A.

Margaret Cavendish (1623—1673)

Margaret Lucas Cavendish, the Duchess of Newcastle, was a philosopher, poet, playwright and essayist. Her philosophical writings were concerned mostly with issues of metaphysics and natural philosophy, but also extended to social and political concerns. Like Hobbes and Descartes, she rejected what she took to be the occult explanations of the Scholastics. Against Descartes, however, she rejected dualism and incorporeal substance of any kind. Against Hobbes, on the other hand, she argued for a vitalist materialism, according to which all things in nature were composed of self-moving, animate matter. Specifically, she argued that the variety and orderliness of natural phenomena cannot be explained by blind mechanism and atomism, but instead require the parts of nature to move themselves in regular ways, according to their distinctive motions. And in order to explain that, she argued for panpsychism, the view that all things in nature possess minds or mental properties. Indeed, she even argued that all bodies, including tables and chairs, as well as parts of the bodies of organisms, such as the human heart or liver, know their own distinctive motions and are thereby able to carry them out. These different parts of nature, each knowing and executing its distinctive motions, create and explain the harmonious and varied order of nature. In several ways, Cavendish can be seen as one of the first philosophers to take up positions against the mechanism of the modern scientific worldview of her time. In this respect she presages thinkers such as Spinoza and Leibniz.

When she turned to discuss political and social issues, Cavendish’s metaphysical commitments seem to remain. Cavendish was a staunch royalist and aristocrat; perhaps not surprisingly, then, she argued that each person in society has a particular place and distinctive activity and that, furthermore, social harmony only arises when people know their proper places and perform their defining actions. She was therefore critical of social mobility and unfettered political liberty, seeing them as a threat to the order and harmony of the state. Even so, her writings also contain nuanced and complex discussions of gender and religion, among a variety of other topics.

Despite her conservative political tendencies, Cavendish herself can be seen as a model for later women writers. She published more than a dozen books under her own name, at least five of them on natural philosophy, a feat which may make her the most published female author of the seventeenth century and one of the most prolific women philosophers in the early modern period. In addition to writing much on natural philosophy, she wrote on a dizzying array of other topics and, perhaps most impressively, in a wide range of genres. Her philosophically informed poetry, plays, letters and essays are at times as philosophically valuable as her treatises of natural philosophy.

Life and Works
Natural Philosophy
Political Philosophy
References and Further Reading

1. Life and Works

Margaret Lucas was born in 1623 in Colchester into a family of aristocrats and staunch royalists. She received little formal education, being tutored at home with her seven siblings, of whom she was the youngest. She reports having spent much time in conversation with one of her brothers, John, who considered himself a scholar and who would become a founding member of the Royal Society. She joined the Queen’s court and served as a maid to Queen Henrietta Maria, following her into exile in 1644, during the English Civil War. While in exile she met William Cavendish, then Marquess and later Duke of Newcastle. They were married in 1645.

While in exile in Paris and Antwerp, she reports discussing philosophy and natural science with her husband and his younger brother, Sir Charles Cavendish, who held a regular salon attended by Thomas Hobbes, Kenelm Digby and occasionally René Descartes, Marin Mersenne and Pierre Gassendi. Margaret herself reports having attended several dinners at which these philosophers were present, though she denies having spoken to them about any but the most superficial of matters.

While her husband remained in exile, she returned to England in 1651 and again in 1653. This was during the Commonwealth, when her husband, were he to have returned, would have had to renounce his royalism and swear fealty to the Commonwealth, as was required by the republican parliament of the time. The parliament did not extend that requirement to women, claiming that women were not capable of such political acts. Thus Margaret was allowed to return to England without swearing fealty to the Commonwealth.

During her 1653 visit, she arranged for the publication of her first collections of writings, Poems and Fancies and Philosophical Fancies. She reports having delivered the second, philosophical treatise a few days too late to have it included with the first in a single publication, which had been her original intention. The publisher was Martin and Allestrye, at the Bell in St. Paul’s Churchyard, a well-regarded house that later became the official publisher for the Royal Society. It is truly remarkable that she was able to secure their publication, as few women published philosophy in England in the seventeenth century, much less under their own name and while in exile.

The same publishing house would publish The World’s Olio and Philosophical and Physical Opinions in 1655 and Nature’s Pictures in 1656. The second work of 1655, Philosophical and Physical Opinions, contained five parts and 210 chapters, the first part of which, consisting of 58 chapters, was in fact a reprinting of her earlier Philosophical Fancies. With her 1655 Philosophical and Physical Opinions, she added a number of epistles and her “Condemning Treatise of Atomes” to the front matter and also extended the work significantly beyond the earlier Philosophical Fancies.

With the Restoration of Charles II to the throne, she returned to England with her husband and continued to write. In addition to publishing on natural philosophy, she also wrote essays on a remarkable variety of other topics, including the nature of poetry, the proper way to hold a feast, fame, women’s roles in society and many others. She also wrote many plays and poems, as well as a fantastic utopia, The Description of a New World, Called the Blazing World in 1668.

There may have been some controversy over a woman publishing works on natural philosophy, as she felt the need to include several epistles, both from herself and from her husband and brother-in-law, attesting to the fact that she had written these works herself. Indeed, she returns to defend herself as an author and natural philosopher at a number of different places in her work, often in epistles to the reader. She also defends the propriety of her being so bold as to write in her own name and to think her thoughts worthy of publication. Her several discussions of fame are worth noting in this context.

She continued to write on natural philosophy, among other topics, to growing attention. She sent her works to many of the well-known philosophers then operating in England, as well as to the faculties at Cambridge and Oxford. Indeed, after she had published her most famous work of natural philosophy, Observations Upon Experimental Philosophy in 1666, she was invited to attend a meeting of the Royal Society, a privilege rarely granted to women at the time.

In all, she may be the most prolific woman writer of early modern Europe and certainly the most prolific woman philosopher. Depending on how one counts, she published over a dozen and perhaps as many as twenty works, at least five of which are works on natural philosophy and many more of which contain essays with substantive philosophical content.

2. Natural Philosophy

Cavendish wrote half a dozen works on natural philosophy. Indeed, natural philosophy constituted the largest part of her philosophical output and a large part of her writing as a whole. Her philosophical commitments can be described as materialist, vitalist and panpsychist. In what follows, her philosophical discussions will be grouped around several recurring themes and arguments.

a. Materialism

Like Hobbes, Descartes or Bacon, Cavendish regularly motivates her position by attacking the Aristotelianism of the schools, mocking those whom her husband calls the “gown-tribe.” She criticizes what she takes to be their commitment to occult powers and incorporeal beings in nature and offers her materialism as an alternative. She explains that her intent is to provide a philosophical system accessible to all, without special training. From her earliest work, Philosophical Fancies, published in 1653, Cavendish argued for materialism in nature. In the first two chapters of that work, which she reprinted in Philosophical and Physical Opinions in 1655, she claims that nature is one infinite material thing, which she sometimes describes as “the substance of infinite matter” (“Condemning Treatise of Atomes”). This infinite material substance is composed of an infinite number of material parts, with infinite degrees of motion. Similarly, this motion is all of the same kind, differing from instance to instance only in swiftness or direction. In other words, the natural world is entirely constituted by a single type of stuff, which she calls matter, and a single force, which she calls motion. She distinguishes the objects and events in nature from one another by the varying parts of matter, bearing different motions, within that one infinite material substance. She explicitly extends this materialist doctrine to the human mind in chapter 2 of the Philosophical Fancies, where she says that the forms of the gown-tribe, as well as human minds, are nothing but “matter moving, or matter moved.” Furthermore, she remained committed to this materialism throughout her career; in her Observations Upon Experimental Philosophy, first published in 1666, for example, she claims that all actions of sense or of reason are corporeal. Thus we see from the very beginning of her first work that she is a materialist.

The exact nature of her materialism develops over time, however. In her earliest work from 1653, she allows for an atomist account of nature and matter, though by 1655 she is already arguing against atomism in her “Condemning Treatise of Atomes”. Later, in her Observations from 1666, she provides at least two arguments against atomism. First, she argues that the concept of an extended yet indivisible body is incoherent, saying, “whatsoever has body, or is material, has quantity; and what has quantity, is divisible” (Ch. 31, 125); this is an argument that was commonly employed against atomism in the seventeenth century. Second, she argues that composite bodies, each with their own motions, could not account for the unity of the complex body, but would instead be like a swarm of bees or a school of fish. Atomism, she argues, cannot explain organic unity. She says, “[w]herefore, if there should be a composition of atoms, it would not be a body made of parts, but of so many whole and entire single bodies, meeting together as a swarm of bees…and the concourse of them would rather cause a confusion, than a conformity in nature” (Ch. 31, 129). Instead of atomism, Cavendish proposes that matter is both infinite in extension and always further divisible. Furthermore, for Cavendish, complex beings such as animals are composed of distinctive matter in motion, which she takes to provide them with their unity. Even so, her primary target is not atomist materialism so much as the occultism of the Schools and the mechanism of some of her contemporaries.

She also applies her materialism to the human mind. In her early works, she suggests that there is nothing of the human being that is not material. For example, in her first work, she wrote a brief dialogue between body and mind, in which she claims that the only way the mind can attain any sort of life after the death of the body is by fame, that is, by being thought well of by others. Indeed, she elsewhere claims that “all the actions of sense and reason…are corporeal” and “sense and reason are the same in all creatures and all parts of nature” (Ch. 31, 128), as well as, “knowledge, being material, consists of parts” (Ch. 37, 160).

Cavendish seems to qualify her materialism with regard to the human soul later in her career, when she clarifies that her previously strong and consistent commitment to materialism only applies to the natural world. For example, in Observations, she claims that humans have both a material mind and, in addition, a supernatural, immaterial soul. She argues that the way in which this supernatural soul is related to the material mind and body is itself supernatural. After all, she suggests, place is a property belonging only to bodies and thus could not belong to an immaterial soul. Therefore, the way in which the immaterial soul is related to the material person is itself a supernatural, that is, miraculous phenomenon. Unfortunately, she offers little explanation for this immaterial soul and refrains from explaining whether or how the immortal soul might interact at all with anything in nature, instead implying that it does not. To make matters even more confusing, she seems to amend her view in 1668 when claiming that only God is immaterial and all other things are material. It may be that she had changed her mind as to whether or not human beings have immaterial, supernatural souls, but the texts themselves do not seem to speak definitively.

Throughout her work, however, Cavendish did claim that human beings possess a material soul. She explains the material, natural soul in the same way in which she explains the mind, through her distinction among the different degrees of motion in matter, as mentioned above. Briefly, she claims that matter may have differing degrees of motion, such that some matter is relatively inert and gross, that is, composed of larger pieces of matter, which she sometimes calls “dull matter”. In contrast, there is also a finer and more rare matter, which possesses more motion. This faster and lighter matter infuses dull matter. The natural, material, human soul or mind, she explains, is the finer, rarer matter within our grosser, cruder material bodies. Scholars have noted the similarity this view bears to Stoic doctrine, in that the rarer, more quickly moving matter resembles the Stoic pneuma.

Just like the Stoics, she also explicitly states in her later works, and suggests at times in her earlier works, that all bodies are completely infused with varying degrees of this active matter. Indeed, it is this matter that accounts for the regularity of natural phenomena across all of nature. She says that “there can be no order, method or harmony, especially such as appears in the actions of nature, without there be reason to cause that order and harmony” (Ch. 6, 207). She claims, for example, that animals possess motions visible externally, such as jumping or running, whereas vegetables and minerals possess and exhibit motions only detectable internally, such as contracting or dilating. She refers to the motions found in animals, vegetables and minerals to varying degrees as sensitive spirits, a term that calls to mind Descartes’ animal spirits. But minerals and vegetables, as well as animals and humans, also possess a further, yet finer and more quickly moving form of matter, which she calls “rational spirits.” These rational spirits are the quickly moving, but rare pneuma-like matter described above, which ultimately explain the various motions and behaviors of the natural objects. Ultimately, though, these motions and the matter they infuse are of the same fundamental kind, differing only in their degree of motion. This view, coupled with her radical claims that “all motion is life” and “knowledge is motion” will lead to her vitalism and panpsychism.

Another of Cavendish’s distinctive commitments about the nature of matter is this: matter bears an infinite degree of motion and, crucially, it bears that motion eternally. In other words, if a bit of matter has a certain degree of motion, according to Cavendish, it cannot lose that degree of motion nor communicate it to another piece of matter. We might say that, for Cavendish, the particular degree of motion that a part of matter bears is essential to that part. Thus, the cruder and grosser matter that bears a lesser degree of motion does so by its nature and cannot lose or gain a degree of motion. Similarly, the more quickly moving, finer parts of matter also bear their greater degree of motion by nature and cannot gain, lose or communicate the motion either. This view is related to another major theme of Cavendish’s work, one that we might call vitalism.

b. Vitalism and the Variability Argument

In addition to her commitment to materialism, Cavendish took pains to reject a position that was often associated with materialism in the seventeenth century, namely that of mechanism. Mechanism can be understood as the view that the natural world, as well as human beings, are made up of uniform material components that interact according to laws of motion and collision. One statement of this view, with which Cavendish was familiar, can be found in the opening chapters of Thomas Hobbes’ Leviathan. René Descartes, too, provided a mechanistic account of the natural world—apart from his commitment to the existence of the immaterial souls of human beings, of course.

Cavendish argued that mechanism could not be an accurate account of the natural world, because it could not properly explain the world that we observe. She claimed that two notable features of the natural world are variety and orderliness. The world around us is full of a vast array of different sorts of creatures and things, each performing distinctive activities or bearing distinct properties. Despite the natural world’s plenitude, it is also orderly. If we understand the nature of a particular creature or substance, we can successfully predict how it might behave or react to certain stimuli. Cavendish reasoned that if the world was ultimately constituted by uniform matter, passively receiving and transferring motion, according to mathematical laws of collision, then the universe should be either entirely homogeneous or entirely chaotic. In other words, if passive, uniform matter communicating motion was really all we had to explain nature, we would not be able to account for both its variety and its orderliness; it would lack one or the other.

Instead, she claimed, different parts of the infinite material substance bear different degrees of motion by nature. They cannot directly transfer motion from one body to another, since motion is a property of the body that possesses it and not something that can exist apart from its body. Thus individual bodies cannot give or receive their motions. Hence, the phenomena we observe are not to be explained by reference to uniform pieces of matter exchanging motion via collision. Rather, she explains, what we see is like a dance, in which each body moves according to its own, distinctive, internal principle, such that a pattern might be created by the dancers on the dance floor. She explicitly offers this dance metaphor in her first work of 1653 and again in 1655. For example, when she explains perception, she claims that the rational spirits flow in and out of the body through the eyes and touch upon the object being perceived, intermixing with the rational spirits found therein. The object, possessing its own distinctive spirits and motions, dances a pattern before the rational spirits, which flow back into the eyes. These rational spirits then take up the dance themselves, flowing back into the brain and continuing the dance, which she takes to be sufficient for the mind’s perceiving the object in virtue of the mind’s containing the distinctive dance or pattern. In these early works, she further explains that the rational spirits copy these dances based on a “natural sympathy” among adjacent bodies, particularly between the rational spirits of the perceiver and object perceived. Note that, throughout this account of perception, motion is never transferred from one body to another. Instead, motions and “dances” are taken up from the internal activity of the rational spirits, that is, from the nature of the moving matter. The matter moves itself according to its own nature and initiates changes in its own motion via natural sympathy.

By the 1660s, though, she largely replaces the dance metaphor with the terms “imitation” and “figuring out”, the latter in the sense of tracing or copying a shape or distinctive pattern of motion. Even so, the account is largely the same. Her argument from the Observations could be reconstructed as follows:

1. Bodies move in orderly and infinitely variable ways.
2. Either they are moved by spirits or they are moved by bodies.
3. But not spirits, because that is mysterious, so bodies.
4. If bodily motion issues from the body, then it must issue from either inanimate matter (mechanism) or animate matter (vitalism).
5. But not inanimate matter (mechanism), for the mechanistic account of bodily motion (such as animal spirits and inanimate fine particles that transmit force) cannot account for the infinite variety and orderliness of the activity in nature.
6. So the bodily cause of motion must be the body’s animate matter, which (it is alleged) has an ability to produce an infinite variety of orderly effects.

This is what might be called the argument from the variability and regularity of nature for self-moving matter. Premise 5 rests on the earlier argument that if the world were ultimately constituted by uniform matter, passively receiving and transferring motion according to mathematical laws of collision, then the universe should be either entirely homogeneous or entirely chaotic. In this argument for self-moving matter, many of the central themes of Cavendish’s natural philosophy are visible: her materialist rejection of incorporeal causes, her denial of mechanistic explanation and her resulting vitalism.

Another significant feature of her natural philosophy, and one that appears especially clearly when she critiques mechanism, is her refusal to take mathematical physics as an exemplar. Whereas Cartesian and Hobbesian natural philosophy could be described as attempts to understand nature with metaphors and modes of explanation taken from the new, mathematical physics, Cavendish instead draws from other sources, especially her personal experiences with country life and, less directly, the life sciences. When explaining natural phenomena, she often makes reference to the behaviors of animals and humans, as well as her awareness of botanical phenomena. She in fact reported in the 1650s that Gerard’s Herbal, a botanical reference book, was the only scientific work she had read. Perhaps because of this, she often explained the behaviors of an animal’s or plant’s rational spirits in terms of their macro-level behaviors, rather than in terms of atomic or corpuscular, mathematical explanation. By the 1660s, at least, we know that she had read and engaged with the work of other vitalists and anti-mechanists, such as the alchemist Johannes Baptista Van Helmont. However, even before that time, her preference for biological metaphors over those of mathematical physics was evident.

Cavendish’s preference for biological modes of explanation can also be seen in her organicism. Not only does she deny atomism, but she also argues that the parts of bodies possess their distinctive motions and natures partly in virtue of the larger organic systems in which they are located. She says, “[f]or example: an eye, although it be composed of parts, and has a whole and perfect figure, yet it is but part of the head, and could not subsist without it” (Observations, Ch. 31). This is not an argument for organicism; instead, she means it as an analogy to illustrate her views on individuals more generally.

Despite the similarities of her vitalism to that of Van Helmont or perhaps Henry More, Cavendish also departs from them in her commitment to materialism. Indeed, she accounts for life in nature by claiming that “[a]ll motion is life,” even in her first work of 1653. Human beings are alive, she says, because they are material beings composed of matter with varying degrees of motion moving in a distinctive pattern. For Cavendish that is all that is needed for something to be alive. Note, though, that all things in nature, from humans and animals and plants down to minerals and artifacts, are the things they are, because they are composed of matter with distinctive patterns and degrees of motion. In this regard, she resembles Hobbes, even though she will ultimately reject his mechanistic view of matter, especially with her view that all matter is self-moving. We might therefore say that Cavendish’s natural philosophy is committed to pan-vitalism or animism, or even, as Cudworth would later say, hylozoism. But we must remember that her view departs from the Cambridge Platonists and Van Helmont in denying that the principles of life are to be explained by reference to incorporeal powers, entities or properties. All matter is to some extent alive and all of nature is infused with a principle of life, but this principle of life is simply motion.

Thus Cavendish provides a fairly deflationary account of life as motion, and in this regard her natural philosophy may resemble that of Hobbes or Descartes. Despite this similarity, Cavendish again rejects their mechanism in her denial of determinism, even with regard to bodily interaction. Though she often appeals to the orderliness and regularity of nature in defending her theory of self-moving matter, she also recognizes the presence of disorder in nature, such as in disease. In fact, she explains illness or disease as the rebellion of a part of the body against the whole, explaining that some bits of matter have freely chosen alternative motions and thus disrupted the harmonious all. In short, Cavendish ascribes a libertarian freedom not only to human agents but even to the parts of matter themselves, explaining the behaviors of organisms with a social ‘body politic’ metaphor. We might say, then, that she draws from experiences of the biological and botanical world to explain her metaphysics, but she also incorporates a Hobbesian sense of the body politic into her metaphysics and in so doing reinforces her rejection of the mechanistic worldview.

However, Cavendish does not stop at explaining the principle of life by reference to degrees of motion in matter, because she also claims to explain mental representation and ultimately knowledge in this way. When a particular pattern of motion occurs in the brain, say, via perception, the person perceives the object; for the person to have an idea of the object is just for her brain to contain its distinctive motion. More generally, she takes the presence of such patterned motions in matter to mean that said matter has knowledge, at least in some sense. Yet she also argues that such motions can be found throughout all of nature, every body possessing its own distinctive motions. For these reasons, her vitalist materialism fits nicely with her panpsychism.

c. Panpsychism

In saying that all motion is life and that all things in nature are composed of matter with a degree of motion, Cavendish affirms that life permeates all of the natural world, including what we might call inanimate objects. For Cavendish, inanimate objects are alive, because they possess motion, though they might have a lesser degree of motion, and thus a lesser degree of life, than an animal or human being. Indeed, she also believes that knowledge is similarly diffused across all of nature to greater and lesser degrees. For these reasons, we might call Cavendish an incremental naturalist with regard to knowledge and life. That is, she takes distinctively human traits such as knowledge and life to be natural properties that are present to varying degrees throughout all of nature.

Throughout her work, Cavendish argues that whatever has motion has knowledge and that knowledge is innate or internally directed motion. In her Philosophical Fancies of 1653, she explains that

the touch of the heel, or any part of the body else, is the like motion, as the thought thereof in the head; the one is the motion of the sensitive spirits, the other in the rational spirits, as touch from the sensitive spirits, for thought is only a strong touch, and touch a weak thought. So sense is a weak knowledge, and knowledge a strong sense, made by the degrees of the spirits (Chapter 45).

In the next chapter she continues to argue that all matter exhibits regular motion, which occurs because all matter is infused with sensitive spirits; but to have sensitive spirits is to be able to sense; thus all matter senses things.

Now, in her earliest work, she offers at best a “who knows so why not” sort of argument that matter thinks, saying, “[i]f so, who knows, but vegetables and minerals may have some of those rational spirits, which is a mind or soul in them, as well as man?” and “if their [vegetables and minerals] knowledge be not the same knowledge, but different from the knowledge of animals, by reason of their different figures, made by other kind of motion on other tempered matter, yet it is knowledge” (Chapter 46).

Later, for example in her Observations, she argues that the regularity of nature can best, or perhaps only, be explained by admitting that all material bodies possess knowledge. She argues that matter and material beings exhibit regular motion and then argues that “there can be no regular motion without knowledge, sense, and reason” (Observations, 129). Furthermore, she argues that each part of the body and each object in nature exhibits a distinctive activity. The brain thinks; the stomach digests; the loins produce offspring, and they do so in regular and consistent ways. Indeed, each of these organs or parts of the body is itself also composite, made up of an infinite number of smaller bodies. What unites them, however, is their distinctive motions, producing their distinctive behaviors. And Cavendish takes each of these distinctive motions to be a kind of knowledge.

She argues that we ought to think of these distinctive motions as knowledge, because that is the best, or perhaps only, way to explain the regularity and stability of these composites. If these parts are to do these things, they must know what they do, especially given the regular and consistent ways in which they do them. Indeed, without matter knowing its own distinctive motions, she argues, perception would be impossible. She says, “[s]elf-knowledge is the ground, or fundamental cause of perception: for were there not self-knowledge, there could not be perception” (Observations, 155). In short, all material entities, which is to say all things in nature, possess knowledge. The view that all things in nature possess mind or mental properties is panpsychism, to which Cavendish is committed here.

Even so, she uses the concept of knowledge in an unusual way. When she ascribes knowledge to a rock or to my liver, for example, she does not necessarily mean that the rock or my liver has mental states like ours, nor that they can perceive their environments in the same way we do. For Cavendish, the knowledge of a thing like a mirror is, indeed, conditioned by the sort of motions that constitute the mirror, the motions that make it the thing it is; as such, mirror-knowledge and mirror-perception are very different from their human analogues. Even so, the mirror’s perception and knowledge are in some ways analogous to human perception and knowledge; both involve the object’s patterning out its own matter in a way that copies or resembles an external object. Despite this similarity between a mirror and a human, the human being is composed of matter capable of many different kinds of perception and knowledge, whereas the mirror has a very limited ability to pattern out or reflect its environment. And the human has sufficient amounts of rational spirits uniting its parts to be able to conduct rational inquiry, whereas the rational matter of a mirror is very limited indeed.

This might sound as though she is walking back her commitment to panpsychism, but in fact she is not. For these parts or degrees of matter that possess varying levels of awareness are in fact entirely intermixed together in all things. She says, “there is a double perception in all parts of nature, to wit, rational and sensitive…. I believe there is sense and reason, or sensitive and rational knowledge, not only in all creatures, but in every part of every particular creature” (Ch. 36). Thus the rock, though it possesses a great deal of duller matter, also possesses sensitive and even rational spirits within. So Cavendish says,

self-motion is the cause of all the various…actions of nature; these cannot be performed without perception: for all actions are knowing and perceptive; and, were there no perceptions, there could not possibly be any such actions: for, how should parts agree, either in generation, composition, or dissolution of composed figures, if they had no knowledge or perception of each other? (Ch. 37, 167).

In short, Cavendish’s natural philosophy is materialist, vitalist and panpsychist, as well as anti-atomist and anti-mechanist. Unlike many of her opponents who favor mathematical physics, she takes living things, and her limited awareness of the life sciences, as a model for her natural philosophy, as evidenced in her organicism, as well as her particular use of metaphor. In other words, she agrees with Descartes and Hobbes against the occult explanations of the Scholastics, with More and Van Helmont against the reductive mechanism of Hobbes and Descartes and with Hobbes and Stoic materialism against the incorporeal principles of More and Van Helmont.

d. God

Cavendish’s views on God are puzzling. She regularly repeats that we cannot assert the existence of things that are not observable material objects in the natural world and she does so in a way that might suggest to the modern reader that she does not believe in the immortality of the soul or the existence of an immaterial God. This would likely be a mistake, however, as there are several passages where she instead explains that she does not include God in her speculations, because we cannot speak with any degree of confidence about God’s nature. Though God is mostly absent from her work in the 1650s, in the Observations she says, “there is an infinite difference between divine attributes, and natural properties; wherefore to similize [sic] our reason, will, understanding, faculties, passions and figures etc. to God, is too high a presumption, and in some manner a blasphemy” (“Further Observations”, Ch. 10, 215) and “God is incomprehensible, and above nature: but inasmuch as can be known, to wit, his being [i.e., that he exists]; and that he [is] all-powerful…eternal, infinite, omnipotent, incorporeal, individual, immovable being” (“Further Observations”, Ch. 11, 216-17). This certainly suggests that she takes God to exist or, at least, that she takes questions of his existence and nature to lie largely outside of the realm of natural philosophy and instead, perhaps, to be a matter of faith alone.

Nevertheless, we might speculate on the details of her views. As mentioned above, her views on the existence of a supernatural soul seem to be in tension with her other metaphysical commitments. Her views on the existence of an immaterial God seem to be in tension with them as well. Interestingly, she attaches an erratum on the final page of her first work, Philosophical Fancies, apologizing to the reader for having omitted the appropriate pieties and references to God in her natural philosophical system. What is even stranger is that, when she reprinted and rewrote that system in her 1655 Philosophical and Physical Opinions, she again omitted any references to God and instead included the same erratum a second time.

Even so, it is unlikely she thought of herself as an atheist. Perhaps, as some scholars have interpreted Thomas Hobbes, she simply believed that she had no business discussing the nature of God’s existence as that was not a matter of rational inquiry but mere faith. It should be noted, however, that her several discussions of fame suggest that she was not convinced that she would have an existence after her own death.

3. Political Philosophy

In addition to her substantial work on natural philosophy, Cavendish also wrote many other works in a variety of genres, from essays on social issues to poems and plays, even the fantastic utopian fiction The Blazing World. Unlike her work on natural philosophy, however, in which she sets out her views in relatively systematic ways and in philosophical treatises, her thoughts on social or political issues appear in works of fiction or in essays strongly conditioned by rhetorical devices. For example, in Orations of Divers Sorts, she speaks in a variety of voices, imagining several fictional interlocutors who present a number of positions on issues, without indicating the author’s own views. Similarly, in her fiction, she often has several characters advocate for philosophical positions, which complicates any attribution of that view we might make to the author herself. Indeed, in The Blazing World Margaret Cavendish, the Duchess of Newcastle, appears as a character, who advises the Empress of the Blazing World on how her society ought to be governed. In this case, we might feel fairly confident that the views espoused by the character of Cavendish accord with the author’s own, but such attributions should be made only tentatively. Despite the challenges presented by the genres, in which she chose to address these issues, we might still attribute certain general views to her. Among the recurring issues she addressed are aristocracy, gender and fame.

a. Religious Liberty

To see the difficulty in ascribing unambiguous views to Cavendish in these works, consider her thoughts on liberty and stability. In her 1666 fictional work The Blazing World, an Empress restructures her subjects into professional scientific societies. In the story, this change results in a breakdown of social harmony; the old institutions, by which the society had harmoniously functioned, begin to fail, there is strife and faction, and anarchy and civil war loom. Into this situation arrives the character of Margaret Cavendish, who advises the formation of a single state-sponsored religion. She further instructs the Empress in architectural details, indicating that an imposing cathedral be built from a magical burning stone found in this fictional world. The cathedral is to be made, again by some magical device, to float above the city, with a voice issuing from the church and booming decrees that the old ways be reinstated, with everyone being born into and retaining their stations. The character of Cavendish proposes that doing so will cow the factious citizens and make them agree, so that cobblers will beget cobblers, soldiers give rise to soldiers and so on. When the Empress executes this plan, social harmony is restored. This suggests to the reader that the author Cavendish opposes the sort of political progress that the Empress had proposed; the reader might also conclude that Cavendish supports the institution of a strong state Church.

Yet in her 1662 Orations of Divers Sorts, she states in one of her orations that, if the people have already adopted a variety of religious views, then the government should grant liberty of conscience (that is, freedom of religion), because doing so is the only way to maintain peace. Indeed she says explicitly there that the government should grant this liberty, because a failure to do so will result in anarchy. Then, in the very next oration, she argues from a different perspective, claiming instead that liberty of conscience would lead to liberty in the state, which in turn would result in anarchy. Political liberty, she claims, undermines the rule of law, without which there can be no justice and thus there will be anarchy. Finally, she presents a third oration in defense of a middle view. There she argues that liberty of conscience is acceptable if it concerns only private devotions, but not if it disrupts the public. In other words, if people’s religious beliefs neither violate any laws nor harm the public, then those beliefs are to be allowed. We might speculate that she intends this final, middle view to be taken as the author’s own, but it is not always clear, especially when, rather than presenting two views and concluding with a compromise, she instead presents six or seven different opinions, as she does on the question of whether women are equal to men. Even so, the reader may suspect that, in this case, the compromise view is closest to Cavendish’s own.

One feature that unites these varied discussions, however, is Cavendish’s fundamental commitment to the importance of political stability. In each of the above cases, she motivates her position by assuming that social and political stability must be preserved above all. All the orations, as well as the character of Cavendish in The Blazing World, seem to assume that political stability is the goal and that the sovereign ought to employ whatever means will be successful in securing it. Like Hobbes, then, Cavendish takes the primary function of the State to be the provision of stability. This attitude recurs in her defenses of royalism and aristocracy.

b. Royalism and Aristocracy

Cavendish came from a family of royalists, served as a maid in waiting to Queen Henrietta Maria during her and Charles II’s exile from England at the hands of the republican revolutionaries of Cromwell, and married one of Charles’s staunchest royalist supporters, William Cavendish, Duke of Newcastle. Her commitment to royalism and, more generally, to aristocracy, appears frequently in her writing. When she discusses how a country ought to be governed, she is unwavering in her view that states are best ruled by a King or Queen, who should come from the aristocracy.

One can draw an interesting analogy between her natural philosophy and her politics here. When discussing the distinction between health and illness in animals, Cavendish describes the organism as a body politic; the healthy body is one in which each part of the body plays its role appropriately, whereas a diseased body is one in which one or more parts are in rebellion, acting against their natures, to the detriment of the whole organism. Indeed, given her vitalism and panpsychism, she might describe disease in the human body and political unrest or rebellion in remarkably similar terms. In both cases, the whole body is composed of a variety of different parts, each with its own distinctive activity or motion. Each part knows its role, its place, in the body politic, yet each part is free to direct its motions in a way contrary to its natural activity. If a part chooses to do so, it will throw the orderly harmony of the whole out of balance. To expand upon this metaphysical account, we might say that, for Cavendish, people have certain stations (roles and places) in society from birth by nature, and social harmony is achieved when the citizens conduct themselves according to their knowledge of their own distinctive activities. As long as the cobblers cobble, the soldiers defend, the judges judge and the rulers rule, social harmony will be maintained and each person can cultivate themselves accordingly.

Indeed, this seems to be one of the central features of Cavendish the character’s advice to the Empress in The Blazing World. Being a fantastical and quasi-science fictional story, The Blazing World features citizens of a variety of animal species, all sentient, capable of human language and so on. Originally, each species has its own distinctive roles, belonging to its own species-specific guild. It is to this world that Cavendish urges the Empress to return, one where the citizens are like different species, each with their own peculiar skills and roles received in virtue of what sorts of people their parents were. If the people of The Blazing World simply accepted the stations into which they were born, social harmony would be regained. It is difficult not to see this as a parable of the Restoration of Charles II and the English aristocracy; peace is restored to England by the return of the aristocracy. Moreover, in 1665, the year before The Blazing World was published, her family’s lands were restored and her husband was advanced to a dukedom for his service to the King during the Civil Wars.

c. Gender

Cavendish is also described at times as an early feminist. To be sure, her own remarkable life as an author and philosopher leads many to take her as an exemplar; one might say she was a feminist in deed, if not always in word.

Beyond that, though, some scholars argue that her writings are feminist as well. For many of the reasons cited above, such claims can be complicated. Consider the seven orations on women in her Orations of Divers Sorts. There she presents seven speeches that take up a variety of positions. She begins by lamenting the fact that men possess all the power and women entirely lack it. In a subsequent oration, she speculates that women lack power in society, due to natural inferiority. She then counters in the next oration that women might be able to achieve as much as men were they given the opportunity to engage in traditionally masculine activities. But the next speaker claims that, were women to imitate men in this way, they would become “hermaphroditical.” Instead, this orator suggests, women should cultivate feminine virtues such as chastity and humility. In the very next oration, however, the orator suggests that feminine virtues are inferior to masculine, so women should pursue masculine virtues instead. She concludes the series of orations on this topic with a new position, arguing that women are in fact superior to men because women, through their beauty, can control men.

What is the reader to make of this series of orations? It seems likely that Cavendish affirms the following empirical facts about her society: women lack power; women could gain fame and even perhaps power if they pursued masculine virtues; they might even be as capable as men in cultivating these virtues; yet women would be despised if they did pursue these virtues; if women cultivated feminine virtues, they would not be despised and could even acquire a kind of indirect power, but such a state of affairs is ultimately inferior to the power men possess. What is less clear is whether Cavendish really believes that the pursuit of so-called masculine virtues would somehow harm women by causing them to deny their natures. In other words, it is not clear from these orations whether Cavendish thinks women are naturally inferior to men. In her earlier World’s Olio, on the other hand, she seems less ambivalent, claiming that women are in general inferior to men at rhetoric. Some women may cultivate skill in rhetoric to rival and even exceed that of men, but they are few, she claims in this work.

Some readers might point to The Blazing World, and to the power of the Empress or the success of the character of Cavendish as a political adviser. It is true that the Empress leads her people in a successful naval battle, defeating a mortal enemy of her homeland. A similar event occurs in her story Bell in Campo. Even so, the considerations above suggest that social harmony is restored because she returns to aristocratic values. After all, the notion that a woman might lead an empire, even into war, would not be so foreign to an English subject in the 1660s, given that Queen Elizabeth ruled just a few decades before and had overseen the important naval defeat of the Spanish Armada.

From her first work and throughout her career, Cavendish engaged the issue of women in her writing, reflecting on her own experience as a woman and how, or whether, it shaped her writing or philosophy. Thus, with her impressive life and regular consideration of the relevance of gender to her thought, Cavendish can be seen as an important precursor for later more explicitly feminist writers, even if she herself might not be aptly so described.

4. References and Further Reading

a. Cavendish’s Works in the 17th Century

Only the first publication is listed for each work; Cavendish revised and reprinted several of her works multiple times over the years. So, for example, Observations Upon Experimental Philosophy first appeared in 1666 but reappeared, with the addition of The Blazing World, in 1668. And Grounds of Natural Philosophy is a substantially revised version of her earlier Philosophical and Physical Opinions, which itself contained her early Philosophical Fancies as its first part.

Cavendish, Margaret, Philosophical Fancies, London: printed by Thomas Roycroft for J. Martin and J. Allestrye, 1653.
Cavendish, Margaret, The World’s Olio, London: printed for J. Martin and J. Allestrye, 1655.
Cavendish, Margaret, Philosophical and Physical Opinions, London: printed for J. Martin and J. Allestrye, 1655.
Cavendish, Margaret, Nature’s Pictures, London: printed for J. Martin and J. Allestrye, 1656.
Cavendish, Margaret, Playes, London: printed for J. Martin, J. Allestrye and T. Dicas, 1662.
Cavendish, Margaret, Orations of Divers Sorts, Accommodated to Divers Places, London: printed by W. Wilson, 1662.
Cavendish, Margaret, ‘Bell in Campo’, in Playes, London: J. Martin, J. Allestrye and T. Dicas, 1662.
Cavendish, Margaret, Sociable Letters, London: printed by William Wilson, 1664.
Cavendish, Margaret, Philosophical Letters, London: possibly printed by David Maxwell, 1664.
Cavendish, Margaret, Observations Upon Experimental Philosophy, London: printed by Anne Maxwell, 1666.
Cavendish, Margaret, The Description of a New World, Called the Blazing World, London: printed by A. Maxwell, 1666.
Cavendish, Margaret, Life of William, London: printed by A. Maxwell, 1667.
Cavendish, Margaret, Grounds of Natural Philosophy, London: printed by A. Maxwell, 1668.

b. Modern Editions of Her Works

Cavendish, Margaret, The Description of a New World, Called the Blazing World and Other Writings, ed. Kate Lilley. London: William Pickering, 1992.
Cavendish, Margaret, Paper Bodies: A Margaret Cavendish Reader, eds. Sylvia Bowerbank and Sara Mendelson. Peterborough, ON: Broadview Press, 2000.
Cavendish, Margaret, Observations upon Experimental Philosophy, ed. Eileen O’Neill. Cambridge: Cambridge University Press, 2001.
Cavendish, Margaret, Margaret Cavendish: Political Writings, ed. Susan James. Cambridge: Cambridge University Press, 2003.

c. Secondary Literature

Battigelli, Anna, 1998, Margaret Cavendish and The Exiles of the Mind, Lexington: The University Press of Kentucky.
- An overview of Cavendish’s life and works from a scholar of English literature, with discussions on genre and rhetorical devices in her works.
Boyle, Deborah, 2006, “Fame, Virtue, and Government: Margaret Cavendish on Ethics and Politics,” Journal of the History of Ideas, 67: 251–289.
- One of the few discussions of Cavendish’s ethics, with a productive focus on fame.
Boyle, Deborah, 2013, “Margaret Cavendish on Gender, Nature, and Freedom,” Hypatia 28 (3): 516-532.
- An excellent account of the complexities of Cavendish on gender.
Broad, Jacqueline, 2002, Women Philosophers of the Seventeenth Century, Cambridge: Cambridge University Press.
- This text contains a chapter on Cavendish.
Clucas, Stephen, 1994, “The Atomism of the Cavendish Circle: A Reappraisal,” The Seventeenth Century, 9: 247–273.
- Clucas argues that Cavendish never really gave up atomism.
Cunning, David, 2006, “Cavendish on the Intelligibility of the Prospect of Thinking Matter,” History of Philosophy Quarterly, 23: 117–136.
- A discussion of Cavendish and the notion of thinking matter, with connections to contemporary philosophy of mind.
Cunning, David, 2010, “Margaret Lucas Cavendish,” Stanford Encyclopedia of Philosophy.
Detlefsen, Karen, 2006, “Atomism, Monism, and Causation in the Natural Philosophy of Margaret Cavendish,” in Daniel Garber and Steven Nadler (eds.), Oxford Studies in Early Modern Philosophy, 3: 199–240.
- A long and thorough exploration of some themes in Cavendish’s metaphysics. She refutes Clucas on atomism and provides an insightful analysis on causation.
Detlefsen, Karen, 2007, “Reason and Freedom: Margaret Cavendish on the Order and Disorder of Nature,” Archiv für Geschichte der Philosophie, 89: 157–191.
- Detlefsen notes that matter itself must be free of necessity, in order to explain the disorder in nature that Cavendish allows for, especially in disease, in part via a ‘body politic’ analogy.
Detlefsen, Karen, 2009, “Margaret Cavendish on the Relationship Between God and World,” Philosophy Compass, 4: 421–438.
- An overview of Cavendish’s views on God.
Duncan, Stewart, 2013, “Cavendish and the Divine, Supernatural, Immaterial Soul,” The Mod Squad: A Group Blog in Modern Philosophy, Accessed November 4, 2014.
- A discussion and consideration of the nature and role of the supernatural soul in Cavendish’s metaphysics.
Duncan, Stewart, 2012, “Debating Materialism: Cavendish, Hobbes, and More,” History of Philosophy Quarterly 29 (4): 391-409.
- An analysis of Cavendish that clarifies and contextualizes her materialism vis-à-vis Hobbes and More, with whom her thought shares some important similarities.
Hutton, Sarah, 1997, “In Dialogue with Thomas Hobbes: Margaret Cavendish’s natural philosophy,” Women’s Writing, 4: 421–432.
- Cavendish’s debt, and response, to Hobbes’s metaphysics.
James, Susan, 1999, “The Philosophical Innovations of Margaret Cavendish,” British Journal for the History of Philosophy, 7: 219–244.
- An excellent overview of the major themes in Cavendish’s metaphysics.
James, Susan, 2003, “Introduction,” in Margaret Cavendish: Political Writings, ed. Susan James, Cambridge: Cambridge University Press.
- An overview of Cavendish’s social and political themes.
Kroetsch, Cameron, 2013, “List of Margaret Cavendish’s Texts, Printers, and Booksellers,” The Digital Cavendish Project, Accessed November 4, 2014.
- A detailed account of the printing and publishing of Cavendish’s works.
Lascano, Marcy. “An Introduction to Margaret Cavendish, or ‘Why You Should Include Margaret Cavendish in Your Early Modern Course and Buy the Book.’” The Mod Squad, A Group Blog in Early Modern Philosophy. Accessed July 14, 2014.
- Lascano makes a compelling case for the inclusion of Cavendish in Early Modern Philosophy survey courses.
Lewis, Eric, 2001, “The Legacy of Margaret Cavendish,” Perspectives on Science, 9: 341–365.
- An overview of Cavendish’s reception, both among her contemporaries and ours. Valuable in part for its identification of lacunae in recent scholarship.
Michaelian, Kourken, 2009, “Margaret Cavendish’s Epistemology,” British Journal for the History of Philosophy, 17: 31–53.
- The only extended discussion of Cavendish’s epistemology, with a special focus on her distinction of internal and external knowledge.
O’Neill, Eileen, 1998, “Disappearing Ink: Early Modern Women Philosophers and Their Fate in History,” in Janet A. Kourany (ed.), Philosophy in a Feminist Voice, Princeton: Princeton University Press.
- The locus classicus for discussion of the way in which women philosophers were written out of histories in the past two centuries.
O’Neill, Eileen, 2001, “Introduction,” in Margaret Cavendish, Observations Upon Experimental Philosophy, Eileen O’Neill (ed.), Cambridge: Cambridge University Press, x-xxxvi.
- An excellent account of Cavendish’s mature thought, in what is arguably her greatest work.
Sarasohn, Lisa, 2010, The Natural Philosophy of Margaret Cavendish: Reason and Fancy During the Scientific Revolution, Baltimore, MD: The Johns Hopkins University Press.
- An examination of Cavendish’s natural philosophy by an historian of science.
Whitaker, Katie, 2002, Mad Madge: The Extraordinary Life of Margaret Cavendish, Duchess of Newcastle, the First Woman to Live by Her Pen, New York: Basic Books.
- An entertaining biography of Cavendish.

Author Information

Gwendolyn Marshall
Email: eumarsha@fiu.edu
Florida International University
U. S. A.

Gottfried Wilhelm Leibniz (1646-1716)

Widely hailed as a universal genius, Gottfried Wilhelm Leibniz was one of the most important thinkers of the late 17th and early 18th centuries. A polymath and one of the founders of calculus, Leibniz is best known philosophically for his metaphysical idealism, his theory that reality is composed of spiritual, non-interacting “monads,” and his oft-ridiculed thesis that we live in the best of all possible worlds. Though these ideas may make his philosophy seem exceedingly abstract, Leibniz had a keen interest in less abstract fields, such as empirical physics and jurisprudence. He also made great contributions to logic, with some considering him the greatest logician since Aristotle.

Due to his belief in a rationally ordered universe, his commitment to the principle of sufficient reason, and his acceptance of innate ideas, Leibniz is rightly ranked along with Descartes and Spinoza as one of the seminal early modern rationalists. Leibniz stands out in this tradition, however, for his novel efforts to find compatibility between classical and modern thought. He retained ancient and scholastic notions such as substantial form and final cause, while at the same time attempting to improve upon the mechanical philosophies of Hobbes, Spinoza, and Descartes. He also hoped his comprehensive philosophical system would serve as a common ground for uniting the determinedly divided Christian denominations in Europe. Such irenic pursuits make Leibniz a unique transitional figure in the history of philosophy. He has been called both the last in the lineage of great Christian Platonists and the first thinker to tackle the intellectual problems of modern Europe. After an introduction to his life and works, this article examines the key elements of Leibniz’s ambitious philosophical program.

1. Life and Writings
2. Key Principles
3. Metaphysics
4. Theodicy
5. Epistemology
6. Ethics
   a. Intellect and Will
   b. Justice and Charity
7. References and Further Reading
   a. Primary Sources: Leibniz Texts and Translations
   b. Secondary Sources

1. Life and Writings

Leibniz was born on 1 July 1646, during the waning years of the Thirty Years’ War, in the Lutheran town of Leipzig. His father, Friedrich, was professor of moral philosophy at the University in Leipzig. His mother, Catherina Schmuck, was the daughter of a law professor. Leibniz grew up in an educated and, by all accounts, orthodox Lutheran environment. Between the books of his father, those of his maternal grandfather, and the contributions of Friedrich’s bookselling former father-in-law, Leibniz had access to an impressive library. At a young age, he gained a love for classical literature and the writings of the Church Fathers.

From 1661 to 1663, Leibniz pursued university studies in Leipzig, with a brief stay at the university in Jena in 1663. At the time, the curriculum at these universities was still largely scholastic, with some pedagogical practices bearing traces of the Ramist encyclopedic tradition. Leibniz’s main teachers, Jakob Thomasius in Leipzig and Erhard Weigel in Jena, were Aristotelians with eclectic interests. Leibniz had his own eclectic interests, having gained some, mostly second-hand, familiarity with modern mechanical philosophy. Later in his life, he recounted a fateful stroll through the Rosental in Leipzig in which he debated the respective merits of scholastic and modern thinking. “Mechanism finally prevailed,” he recalled, “and led me to apply myself to mathematics” (G III, 606). Though steeped in classical and scholastic learning, Leibniz at quite a young age fashioned himself a man of the times.

Leibniz went on to pursue a degree in law, earning his doctorate from the University in Altdorf in 1666. His writings from his student years include his bachelor’s dissertation, A Metaphysical Disputation on the Principle of Individuation, an early work in combinatorial logic titled A Dissertation on the Art of Combinations, and works on legal theory.

After short stints in Nuremberg and Frankfurt, Leibniz took his first major employment in the Catholic court of the Prince-Archbishop of Mainz, Johann Philipp von Schönborn, in 1668. Leibniz was tasked with reforming legal codes and statutes. During his time in Mainz, Leibniz struck up an important relationship with Baron Johann Christian von Boineburg, the central statesman in the Mainz court. Boineburg appreciated Leibniz’s considerable talents and set before him the task of solving the day’s most pressing philosophical and theological questions. Through his association with Boineburg, Leibniz began to see the challenges modern philosophy, especially the materialism of Gassendi and Hobbes, posed to belief in the immortality of the soul, to belief in God and natural law, and to both Catholic and Lutheran understandings of the Eucharist. From 1668 to 1670, Leibniz thus began working on a number of preliminary studies meant to be part of a comprehensive work entitled Catholic Demonstrations. Though this dreamed-of magnum opus never materialized, Leibniz never abandoned his goal of developing a modern philosophy congenial to Christian theology. In addition to his Catholic Demonstrations writings, Leibniz’s Elements of Natural Law, written between 1669 and 1671, also contributed to these efforts. Furthermore, during this period Leibniz intensified his interest in physics, writing the Theory of Abstract Motion and the New Physical Hypothesis, and penning an unanswered letter to Thomas Hobbes on the Englishman’s physical theory as it relates to the philosophy of mind. Leibniz in hindsight found these youthful physical works unimpressive, but they attest to the diversity of his interests.

Mainz opened Leibniz to an extraordinarily broad range of philosophical concerns; his most intense period of intellectual development soon followed. In 1672, Leibniz was dispatched to Paris on a diplomatic mission as well as on personal business for Boineburg. Paris exposed Leibniz to learning, resources, and interlocutors the likes of which he had never seen. He had access to the unpublished writings of Descartes and Pascal. He met with leading Parisian intellectuals Antoine Arnauld and Nicolas Malebranche. He studied mathematics under the Dutch mathematician Christiaan Huygens. He twice visited London, in 1673 and 1676, meeting with the mathematicians and physicists of the Royal Society. Leibniz’s friend Walther von Tschirnhaus, though forbidden from showing Leibniz an advance copy, apprised Leibniz of many of the contents of Spinoza’s Ethics. This led Leibniz, upon leaving Paris in 1676, to make an excursion to The Hague to visit Spinoza.

Paris and London offered Leibniz the opportunity to establish himself as a rising star in the European intellectual orbit and Leibniz did not squander his chance. By 1675 he had developed the infinitesimal calculus, only three years after he started the serious study of contemporary mathematics. He also continued to write on a wide range of philosophical topics. His Confession of a Philosopher of 1672-3 was his first response to the problem of evil and to the question of determinism. His most important collection of metaphysical papers from the period, De summa rerum, contains some of Leibniz’s early responses to Spinoza’s monism, with budding reflections on the relationship between mind and body, on the nature of the continuum, and on universal harmony.

In 1676, Leibniz accepted a position in the court of Duke Johann Friedrich of Hanover, employed mainly to serve as court librarian and to consult on engineering projects in the Harz mines. After his taste of the intellectual scenes in Paris and London, Leibniz found life in Hanover a disappointment. Despite his lack of professional prospects, Leibniz would in the ensuing decade sharpen his intellectual vision. He published a number of important essays on mathematics, epistemology, and physics in the new journal Acta Eruditorum. In 1686, while it snowed in the Harz, Leibniz composed “a little discourse on metaphysics.” Now published without the diminutive “little,” the Discourse on Metaphysics is widely considered Leibniz’s first mature philosophical statement. Leibniz sent a summary of the Discourse to Arnauld, sparking an extended and illuminating correspondence between them on issues of freedom, causality, and occasionalism.

In 1689, Leibniz travelled to Italy on official business, researching possible ancestral ties to the Guelf Dukes of Hanover. Leibniz, never one to let official duties interfere with his own intellectual agenda, used the opportunity to pitch his metaphysics to leading Catholic intellectuals. He also wrote works on cosmology in efforts to exonerate the Copernican system from Vatican censure.

Leibniz returned in 1690 to Hanover, which remained his home base until his death. Leibniz continued to write prodigiously and we can mention here only a small sample of his works. The year 1695 saw the publication of the first part of his Specimen of Dynamics and his New System of Nature. The former work included Leibniz’s reflections on the nature of force, and in many ways was developed in response to Newton’s Principia Mathematica; the latter was Leibniz’s first public presentation of his theory of pre-established harmony. In 1703, Leibniz began work on the New Essays on Human Understanding, a book-length dialogue in response to Locke’s Essay Concerning Human Understanding. The only book Leibniz published during his lifetime, the Theodicy, was released in 1710. In this work, Leibniz defends his thesis that we live in the best of all possible worlds and defends the reasonableness of Christianity against the fideism and skepticism of Pierre Bayle. In 1714, Leibniz wrote the Monadology, the last comprehensive summary statement of his philosophical views.

Throughout his years in Hanover, Leibniz maintained a stunning number of epistolary correspondents. Notable among these were Samuel Clarke, Burchard de Volder, Johann Bernoulli, Bartholomew Des Bosses, and Christian Wolff. Leibniz also corresponded and often met with Sophie, Electress of Hanover, and her daughter Sophie Charlotte, Queen of Prussia. These women encouraged, and in many ways made possible, Leibniz’s philosophical pursuits while employed at the court.

Leibniz’s final years were clouded by charges that he stole ideas from the papers of Isaac Newton when developing the calculus in the 1670s. Leibniz has been cleared of the charges and it is now accepted that the two men developed the calculus independently. Leibniz died on 14 November 1716 after struggles with gout and arthritis.

Unlike the other major philosophical lights of his era, and despite having written more than any of them, Leibniz produced no magnum opus. He seemed most at home in dialogue, in correspondence, and in controversy. The Discourse on Metaphysics and Monadology are his most commonly studied works in metaphysics. Scholars disagree about the extent to which the two works are in accord, but they together provide a solid grounding in Leibniz’s thought. The Theodicy is a classic of philosophical theology and the New Essays provides the fullest account of Leibniz’s epistemology. This article will summarize Leibniz’s philosophy mainly as it is presented in these works. It would be a mistake, however, to think that one can get a full picture of Leibniz’s interests from these works and the reader is encouraged to consult the many excellent edited selections of Leibniz’s texts.

2. Key Principles

Several key principles form the core of Leibniz’s philosophy. Though Leibniz never lists these serially in the manner of, for instance, the axioms of Spinoza’s Ethics, the principles nonetheless shape Leibniz’s thinking and ground his major claims. He refers to them throughout his writings and we shall refer to them throughout our discussion. Though each of these principles merits further analysis in its own right, we introduce them only briefly here. Truly unique to Leibniz is not so much these principles in themselves as the use to which he collectively puts them.

In the Monadology, Leibniz writes that we reason “based on two great principles” (M 30). The first of these is the principle of contradiction, which deems every contradiction to be false. Classically stated, the principle of contradiction holds that something cannot be both “x” and “not x” at the same time and in the same respect. Aristotle claimed that all logic and reasoning presupposes the principle of contradiction and Leibniz sees no reason to think otherwise.

The second great principle of reason is the principle of sufficient reason, “by virtue of which we consider that we can find no true or existent fact, no true assertion, without there being a sufficient reason why it is thus and not otherwise, although most of these reasons cannot be known to us” (M 31). The classical statement of the principle of sufficient reason is nihil sine ratione: there is nothing without reason or cause. Leibniz holds that every state of affairs has an explanation, even if we must admit that we often do not have sufficient information to provide an explanation. The principle of sufficient reason assumes great prominence in Leibniz’s philosophy, most notably in his accounts of substance, causality, freedom, and optimism.

Closely related to the principle of sufficient reason is the principle of the best. This principle holds that rational beings always choose, and act for, the best. In this way, reason is teleologically ordered towards goodness. On Leibniz’s thinking, if reason did not opt for what is best, it would act arbitrarily; it would not have a sufficient reason for choosing one option over another, thus violating reason’s second great principle. Goodness provides the sufficient reason for rational choice. The principle of the best manifests itself differently in the cases of God and created minds. God, whom Leibniz considers “an absolutely perfect being” (DM 1), and who thus knows what is best, always acts in the best way. Created minds, who have a finite degree of perfection and thus limited knowledge of what is best, always act according to what seems the best from their limited perspectives.

The predicate-in-notion principle provides Leibniz’s notion of truth: praedicatum inest subjecto. In any true, affirmative proposition the predicate is contained in the subject. In order for the proposition, “Leibniz is a mathematician,” to be true, the idea “mathematician” must somehow be included in the idea “Leibniz.” Leibniz’s interpretation of the predicate-in-notion principle, we shall see, has far-reaching consequences for his metaphysics. Somewhat relatedly, Leibniz affirms the principle of the identity of indiscernibles, which states that any two objects sharing all properties are in fact the same, identical object. Each individual object contains some individuating characteristic. Importantly for Leibniz, this individuating characteristic must be something intrinsic to the individual, and not simply a separation in space and time, which Leibniz considers purely extrinsic denominations. The principle of the identity of indiscernibles is tied closely to the predicate-in-notion principle insofar as the latter makes intrinsic properties the basis of all truth and the former makes such properties the basis for identity and individuation.
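
In modern logical notation, which is not Leibniz’s own, the principle of the identity of indiscernibles is commonly rendered along the following lines. The symbols x, y, and F are illustrative, and restricting F to intrinsic properties is an interpretive assumption meant to reflect the point made above about extrinsic denominations.

```latex
% Identity of indiscernibles in a standard second-order rendering
% (modern notation, not Leibniz's own).
% Assumption: F ranges over intrinsic properties only.
\[
  \forall x \,\forall y \,\bigl[\, \forall F\,(Fx \leftrightarrow Fy) \;\rightarrow\; x = y \,\bigr]
\]
% Contrapositive: distinct individuals differ in at least one such property.
\[
  \forall x \,\forall y \,\bigl[\, x \neq y \;\rightarrow\; \exists F\,\neg(Fx \leftrightarrow Fy) \,\bigr]
\]
```

On this reading, a mere difference in spatial or temporal position would not count as a value of F, which is why the principle requires every individual to carry an intrinsic individuating characteristic.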

A final key principle worth noting is the principle of continuity. “Nothing takes place suddenly, and it is one of my great principles that nature never makes leaps,” Leibniz writes in the New Essays. “I call this the Law of continuity” (NE 56). All change is continuous; there is never a leap, but rather a series of intervening stages. This principle is especially germane to Leibniz’s development of the infinitesimal calculus, but relevant too to his metaphysics and epistemology.

3. Metaphysics

a. Substantial Forms

One of the earliest intellectual projects Leibniz set for himself was to determine the proper relationship between the Aristotelian philosophy taught at his university in Leipzig and the new, mechanical philosophy espoused by thinkers like Galileo, Descartes, and Hobbes. Leibniz embraces modern, mechanical physics as the proper method for investigating nature, yet he is distinctive among 17th-century thinkers for the depths of his efforts to retain several key metaphysical concepts of ancient and medieval philosophy. Chief among these concepts is the Aristotelian idea of substantial form. Though Leibniz does not adopt the traditional understanding of substantial form in its details, his grappling with the legitimacy of this notion sets the trajectory for much of his metaphysics.

Aristotle, with the medieval scholastics following him, argues that any individual thing consists of a substantial form, which determines the kind of thing it is, and matter, which individuates the thing and makes it numerically distinct from other like substances. So, a particular squirrel consists of the universal form “squirrel” shaping and directing particular material stuff. In the 17th century, the idea that substantial forms should enter into physical accounts of nature becomes especially odious. Citing “squirrelness,” the moderns maintain, tells us nothing regarding the activity of a squirrel. For thinkers such as Hobbes and Descartes, substantial forms are useless fictions, at best superfluous and at worst misleading. The mathematically-based, mechanical laws governing matter in motion suffice to explain the whole of nature, with no need to take into account the kind of thing under investigation. What counts in describing the behavior of a squirrel is not its “squirrelness,” but the forces its limbs exert on one another, the pressure differentials in its circulatory system, and other quantifiable data. This approach makes it possible to have a single method for investigating all natural phenomena.

Leibniz agrees that substantial forms have no use in physics, but he insists metaphysical accounts of reality require something like substantial forms. Mechanical explanation adequately addresses the activity of the physical world, but not its underlying nature. For Leibniz, the corporeal world in its very essence depends on incorporeal principles. Both Hobbes’ purely materialist metaphysics and the strict substance-dualism of Descartes fail to properly appreciate nature’s dependence on purely metaphysical entities. Ultimately, Leibniz’s defense of substantial forms provides the first step in the development of his idealist metaphysics.

Leibniz offers several defenses of substantial forms, in which he tries not to revive Aristotle’s notion of form wholesale, so much as to prove the existence of irreducible, incorporeal entities. One argument turns on the principle of sufficient reason: the fact that the corporeal world itself cannot offer any explanation for its particular features. Why does a given body occupy so much space, have a particular shape, or move in just this way? By limiting oneself to mechanical explanation, one can either say that body A’s features were caused by body B, or one can say that body A has had its particular constitution from eternity. The former approach leads to an infinite regress in explanation, which is to say it never arrives at an explanation at all. There is always yet another body requiring explanation. The latter approach, for Leibniz, likewise offers no real explanation. Citing eternity as a reason, he feels, amounts to answering the question “Why is A, x?” with “Simply because A is x and always has been x,” dodging the question. Since the corporeal world does not contain sufficient explanation for its own features, Leibniz concludes that the cause of such features lies in incorporeal principles.

In a second defense of incorporeal substantial principles, Leibniz denies the Cartesian distinction between the primary qualities of bodies and secondary qualities such as color and temperature (DM 12). Descartes, anticipating Locke, argues that the secondary qualities of bodies are relative to the perceiving subject. For instance, as we observe in cases of color-blindness, one person perceives an object as red and another person perceives the same object as green. Color, the argument goes, is thus not a property of the body itself, but depends on the interaction between object and perceiver. Descartes holds, however, that size, shape, and motion are not relative properties, but constitute the essence of body itself. Leibniz, believing that space and time are relative, counters that these primary properties, which depend on space and time, also include something relative to perception. No perceived material quality, therefore, accounts for what a body essentially is. It follows that incorporeal principles must be the real metaphysical building blocks of reality.

A third argument for substantial forms comes in Leibniz’s treatment of force. Descartes had confused force with what we would call momentum. He measured force by multiplying mass by velocity (mv), rather than by acceleration, as Newtonian mechanics later would, or by the square of velocity (mv²), the measure Leibniz himself defends. For Leibniz, this error on the part of Descartes points to an important fact about reality. Motion, measured by mv, is relative. When several objects change positions, one cannot with certainty attribute motion to one object or another. Force, however, has more reality. We have sufficient reason to attribute it to one body over others. In other words, we have more certainty which body in a system is the proximate cause of changes in other bodies. Force, therefore, has more reality than motion, and yet force is not corporeal in the way both mass and velocity are, since force is not extended. Though Descartes’ confusion seems simply an error in calculation, in it Leibniz sees additional indication that the realities grounding corporeal objects are not themselves corporeal.
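
The difference between the two measures can be seen in a short worked example, sketched here in modern notation and arbitrary units rather than in Leibniz’s own terms. It follows the falling-bodies case Leibniz discusses in §17 of the Discourse on Metaphysics: a body of weight 1 falling from height 4 and a body of weight 4 falling from height 1, which both parties grant acquire equal force, since raising either body back to its starting height takes the same effort. The sketch assumes Galileo’s law that the speed acquired in a fall is proportional to the square root of the height.

```latex
% Modern notation and arbitrary units; the symbols are illustrative, not Leibniz's.
% Assumption (Galileo's law of fall): v \propto \sqrt{h}.
% Shared premise: equal force, because m_A h_A = 1 \cdot 4 = 4 \cdot 1 = m_B h_B.
\begin{align*}
  \text{Body } A:&\quad m_A = 1,\; h_A = 4 \;\Rightarrow\; v_A = 2 \\
  \text{Body } B:&\quad m_B = 4,\; h_B = 1 \;\Rightarrow\; v_B = 1 \\[4pt]
  \text{Cartesian measure } (mv):&\quad m_A v_A = 2 \;\neq\; 4 = m_B v_B \\
  \text{Leibniz's measure } (mv^2):&\quad m_A v_A^2 = 4 \;=\; 4 = m_B v_B^2
\end{align*}
```

Under these assumptions, only the measure mv² assigns the two bodies the equal force that the shared premise requires, which is the discrepancy Leibniz takes to expose Descartes’ error.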

b. Substance as Complete Concept

Though his defense of incorporeal substances allows Leibniz to partially reconcile pre-modern and modern thought, Leibniz still needs to articulate his own account of the nature of these substances. In §8 of the Discourse on Metaphysics, Leibniz takes up the task of defining individual substance. He begins with Aristotle’s definition, which states that when many things are said of a subject, yet it is said of nothing else, this subject is rightly called an individual substance. So, for instance, we say of Alexander the Great that he is Macedonian and ambitious, but we do not say of anything else that it is Alexander the Great. Thus, Alexander is an individual substance.

Leibniz deems this Aristotelian definition of substance merely logical. It tells us something about the structure of thought and language, but does not provide a metaphysical account of substance. To move to a proper metaphysical understanding, Leibniz believes we must look more closely at the nature of predication. “All true predication,” he writes, “has some basis in the nature of things.” Here, Leibniz shows his belief that there is an isomorphism between metaphysics and logic. All true propositions have an ontological basis. All we can truly say of Alexander the Great is included in Alexander’s nature.

The idea that each substance includes all the predicates which belong to it is, Leibniz takes it, simply a metaphysical restatement of the predicate-in-notion principle. On the basis of this principle, Leibniz arrives at his notion of substance as a complete concept:

The nature of an individual substance or of a complete being is to have a notion so complete that it is sufficient to contain and to allow us to deduce from it all the predicates of the subject to which the notion is attributed. (DM 8)

Leibniz’s thought is essentially this: if one had a sufficiently powerful intellect, one could deduce from the idea of any individual substance all that could ever be said of it, in just the same way that if one has a clear and distinct idea of a circle, one can deduce all the properties of a circle. From the very concept of Alexander the Great, the infinite intellect of God can deduce all Alexander’s qualities, including that he is the vanquisher of Darius. To be a substance, then, is to have such a corresponding complete concept. Every substance, as it were, includes its biography.

Beginning in the 1690s, “monad” becomes Leibniz’s preferred term for a complete, incorporeal, individual substance. The term monad derives from the Greek mónos, meaning alone or solitary. Leibniz introduces the term to underscore the fact that individual substances are not only complete, but also simple. As Leibniz’s defense of substantial forms showed, the material realm needs grounding in something incorporeal. Matter, however, can be infinitely divided. Leibniz therefore reasons that there must be infinitely many simple monads populating the world at even the most infinitesimal levels. Leibniz likens the fullness and complexity of the monadic universe to “nested” ponds and gardens.

Each portion of matter can be conceived as a garden full of plants, and as a pond full of fish. But each branch of a plant, each limb of an animal, each drop of its humors, is still another such garden or pond. (M 67)

Monads are thus “spiritual atoms,” the incorporeal building blocks of all reality. They are the complete entities which merit the designation “substance.”

It is in the nature of each monad to have its own internal principle of activity. As Leibniz writes, “activity is of the essence of substance in general” (NE 65). Beginning in the 1690s, Leibniz refers to the internal activity of substances as their primitive active forces. Defining substance in terms of activity is important to Leibniz for several reasons. For one, this position is of a piece with his contention that the activity of corporeal entities is grounded in that of incorporeal entities. In order to play this role, incorporeal monads must themselves be active. More importantly, Leibniz broaches the discussion of substance in the Discourse on Metaphysics with the goal of differentiating the actions of God from those of creatures. In arguing that each substance has its own primitive active force, Leibniz distances himself both from Spinoza’s monism and Malebranche’s occasionalism, the former holding that individual things are not themselves substances but rather modes of a single divine substance, and the latter invoking God’s power to explain the ordinary doings of creatures. To Leibniz, each of these positions insufficiently appreciates that each substance is complete and active in itself. For, were created substances to lack activity, there would be no distinction between actual, created substances and the possible yet uncreated substances in God’s mind, a modal distinction central to Leibniz’s theodicy.

c. Causality and Pre-Established Harmony

If each substance is complete in itself and requires no other substance to be understood, it follows that every finite substance is causally independent of all save God. Each created substance is, as Leibniz says, “like a world apart” (DM 14). But how can this be? How can Alexander defeat Darius without being related to, and thus in a sense dependent on, Darius? More broadly, how can Leibniz square his “world apart” language with our experience of living in a world with a plethora of cause and effect relationships between substances?

Leibniz responds to these questions by offering a unique theory of causal interaction, which he calls at different points either the theory of pre-established harmony or the hypothesis of concomitance. The theory holds that although no two substances directly influence each other, they can express each other; that is, the activity of one can be reflected in the concept of the other. Alexander, we typically say, caused Darius’ death. Leibniz does not object to this kind of causal attribution, but insists that at the metaphysical level, what we call causality amounts to no more than this: it is in the nature of Alexander to be he who defeats Darius and it is likewise in the nature of Darius to be he who is defeated by Alexander. These two independent substances, as Leibniz puts it, “mirror” each other, so that at the exact moment it can be predicated of Alexander that he is the vanquisher of Darius, it can likewise be predicated of Darius that he is the victim of Alexander.

Hence, although each substance is “like a world apart,” substances form a common world by mirroring, or expressing, one another. God ordains at the moment of creation—in Leibniz’s terms he “preestablishes”—that the perceptions of all creatures in the world harmonize with one another, that there is strict alignment so that at the moment I perceive myself as tapping my friend on the shoulder, she perceives herself as being tapped. Leibniz is fond of likening the relationship between substances to that between two perfectly synchronized clocks which remain aligned despite never touching each other. Causal interaction is no more than what we find in these clocks, the harmonized activity of independent entities. Leibniz famously describes independent monads as “windowless,” neither letting in any outside influence nor issuing any influence (M 7). This is the Leibnizian universe: windowless monads in pre-established harmony.

The theory of pre-established harmony includes the rather strong claim that each substance is harmonized with all other substances in the world. This must be the case if the substances are to form a common world with a common history, since mutual expression is the only possible relation between independent substances. Does this mean that my concept expresses the nature of even a fish living thousands of years ago? In a word, yes. Though Alexander and Darius express each other much more distinctly than I express the ancient fish, my concept must bear traces of the existence of that fish since we are members of a common world. This might seem fantastical, even absurd, but if one considers how much one’s own experience reflects the activities and efforts of one’s predecessors, and how much their activities were constrained by their natural environment, then perhaps one can begin to appreciate Leibniz’s insight that every single substance bears traces of, or faintly expresses, the whole universe, past, present, and future.

Leibniz’s explanation of causality via pre-established harmony and mutual expression has led some commentators to accuse Leibniz of what they call the “mirroring problem.” They object that if substance A expresses the essence of all others, yet these in turn express substance A, then the world is like a hall of mirrors which reflect one another but no concrete images. In this scenario, the concept of any given substance is not complete, as Leibniz would hold, but empty. Although this line of objection points to some of the complexities and potential difficulties in the theory of pre-established harmony, it merits mention that Leibniz sees each substance as fundamentally mirroring God. “It can even be said that every substance bears in some way the character of God’s infinite wisdom and omnipotence and imitates him as much as it is capable” (DM 9). Stating that each substance reflects God’s essence, while also mirroring all other substances, does not directly respond to the mirroring problem. Noting that each substance reflects God’s essence by virtue of its own internal individuating activity perhaps provides a more satisfying response, and it is likely that Leibniz’s solution to the mirroring problem lies in this direction.

d. Idealism

Leibniz’s defense of incorporeal monads as the foundation of the physical world, his notion of substance as a complete concept, and his account of causality via pre-established harmony all contribute to Leibniz’s brand of idealism. By idealism, we mean the thesis that nothing exists in the world but minds and their ideas. As Leibniz summarizes his idealism: “There is nothing in the world but simple substances and in them perception and appetite” (AG 181).

By perception, Leibniz means the “passing state which involves and represents a multitude in the unity or in the simple substance” (M 14). Since each substance is metaphysically complete in itself and “like a world apart,” all changes in its state arise spontaneously, that is, without the intervention of other substances. Yet since each substance mirrors all others, it must contain a multiplicity of representations within itself. The sequence of spontaneous representations is what Leibniz calls perception. Importantly, Leibniz posits that all beings in the world perceive. This is yet another consequence of the fact that mutual representation is the only relation between monads in pre-established harmony. What distinguishes rational, conscious minds from all other substances is not perception, but apperception, or the ability to reflect on their mental processes.

Of appetite, Leibniz writes: “The action of the internal principle which brings about the change or passage from one perception to another can be called appetition; it is true that the appetite cannot always completely reach the whole perception toward which it tends, but it always obtains something of it, and reaches new perceptions” (M 15). The best analogy here is perhaps a mathematical function, where appetite is the analogue of the function equation, or the law of the series, and where each perception represents a discrete value. Leibniz’s point is that each substance has an orientation which defines it and which governs the transition between perceptions. This does not mean that each individual can fully choose or determine the sequence of its perceptions, since it is constrained by the need to faithfully represent the activity of other substances. Appetite does indicate, however, that there is a striving or tendency unique to each substance which shapes the manner in which it reflects the world. Hence Leibniz describes substances as so many distinct “viewpoints” on the universe (DM 14; M 57).

In composite substances, such as living animals whose various parts contribute to the well-being of the entire organism, simple monads unite under the direction of a dominant monad (M 70). Each monad retains its substantial independence, but living organisms display an especially high level of intermonadic harmony. Though Leibniz does not define in detail the operations of dominant monads, these monads must at least subsume others under their own internal principles or appetites. The activity of subordinate monads thereby serves the goals of the dominant monad. Conversely, subordinate monads must have a particularly strong bearing on the perceptions of dominant monads, being, as it were, extensions of them. “There is nothing in the world but simple substances, and in them perception and appetite” may sound like a simple statement, but its simplicity should not mask the manifold degrees of coordination between the perceptions and appetites of monads.

e. The Nature of Body

It follows from Leibniz’s idealism that bodies are phenomenal. In other words, the physical world is the perception of perceiving monads. Leibniz is at pains, however, to insist that his system makes bodies “well-founded phenomena” (phenomena bene fundata). By this Leibniz means that bodies are not arbitrary perceptions lacking veracity. The pre-established harmony among all substances establishes a common realm of truth. Our perceptions thus provide us with knowledge of reality and serve as the starting point for empirical science.

Although “well-founded phenomena” might seem an empty expression within an idealist framework, it gains meaning from Leibniz’s commitment to the principle of sufficient reason, that is, the principle that nothing happens without reason or cause. For Leibniz, God’s rational ordering of creation certifies the reliability of sense perception, since God—the most rational of all minds—cannot do anything without having a reason for doing so. It would be arbitrary of God to give me this particular set of perceptions instead of some other set if it were not the case that my perceptions have some basis in other existing substances (NE 56). The thoroughgoing rational design of the world ensures that my perceptions indeed reflect the true order of things.

Defining bodies as “well-founded phenomena” leaves open the question of the relation of an individual’s mind to his own body. After all, my experience of my body seems qualitatively different from my perception of other things in the world. My arm, for example, moves upwards when I wish to remove my hat. Other bodies do not respond to my will in a like manner. Leibniz again invokes his theory of pre-established harmony to explain this apparent interaction between one’s mental and bodily states.

When I wish to raise my arm, it is precisely at the moment when everything is arranged in the body so as to carry this out, in such a manner that the body moves by virtue of its own laws; although it happens through the admirable but unfailing harmony between things that these things conspire towards that end precisely at the moment when the will is inclined to it, since God took it into consideration in advance, when he made his decision about the succession of all things in the universe. (LA 92)

Leibniz explains that God has arranged the world such that one’s mind and body do not directly influence each other, but nevertheless correspond perfectly at all moments. Leibniz is at pains to emphasize that the mind does not directly move the body because he wants to preserve the integrity of physics. Modern physics, relying on the principles of inertia and the conservation of force, requires that the motion of bodies be explained by other bodies. If minds directly influenced bodies, force could be added to the world at any time, and neither the principle of inertia nor the principle of conservation would hold. What causes the motion of my arm are the electrical impulses and synapses of my nervous system. The parallels between our desires and our bodily movements are instances not of interaction, but of harmony.

It is important to note that Leibniz sees the pre-established harmony between mind and body as following from his general theory of substance. Since minds are substantial and bodies phenomenal, my body is in one sense just a particularly distinct perception of my mind. In this sense, one’s perception of one’s body is not qualitatively different from one’s experience of other phenomena. Taking up Leibniz’s description of monads as various “viewpoints” on the universe, perhaps we can liken the body to one’s viewfinder, one’s lens on the universe, so long as we do not take the metaphor too literally by treating the body as an independent substance.

Though Leibniz adopts the language of “well-founded phenomena” to characterize bodies, scholars have debated the extent to which Leibniz’s idealism entails phenomenalism. The debate, put one way, is whether Leibniz makes bodies so “well-founded” that they have more reality than the term phenomena suggests. There is some consensus around the idea that Leibniz does not fully reduce bodies to perceptions, à la Berkeley, since bodies are aggregates of substantially real monads. Less certain is whether the substantial reality of monads makes labeling Leibniz a phenomenalist less apt. Given Leibniz’s insistence that “there is nothing in the world but [incorporeal] simple substances and in them perception and appetite” (AG 181) and his own use of the term phenomena, it seems most likely that Leibniz did not wish to accord bodies of aggregated monads the same metaphysical status as the monads comprising them. In short, monads are substantial, bodies are phenomenal, and Leibnizian idealism entails phenomenalism.

f. Efficient and Final Causality

Leibniz’s retrieval of the notion of substantial form blossomed into his idealist, monadic metaphysics and theory of pre-established harmony. Pre-established harmony mandates that the activity of bodies be explained by other bodies, not by minds. In explaining the activities of bodies, Leibniz makes a second major effort at reconciling ancient and modern thought. He mounts a defense of the utility of final causes in physics.

Aristotle distinguished between four causes, or four ways of accounting for the being of a thing. Philosophers of the 17th century found particularly objectionable the idea of final cause. The final cause of something indicates its purpose or goal. For instance, one might claim that the final cause of a tree is to grow upwards and reproduce. Thinkers such as Descartes, Hobbes, and Spinoza rejected the utility of final causes in explanations of the physical world, much as they rejected the utility of formal causes, or substantial forms. They restricted physics to the study of efficient causes, mechanical accounts of bodies in motion. We explain the growth of a tree by looking to nutrient transfer from roots to branches, the exchange of compounds in respiration, the means of reproduction. To the moderns, any mention of a tree’s purpose belongs to poetry, not physics.

Leibniz is as committed to mechanical explanation as his contemporaries, yet he bucks the 17th-century trend of discrediting final causes outright. He reconciles the two approaches by offering a doctrine of double explanation. For Leibniz, events in nature are subject to explanation by either efficient or final causes. Leibniz does not adhere strictly to the Aristotelian notion of final cause any more than he adheres to the Aristotelian notion of substantial form. What Leibniz realizes, however, is that consideration of the end state of a physical process can often have as much predictive power as consideration of the motive forces involved. In §22 of the Discourse on Metaphysics, Leibniz cites Fermat’s proof of the refraction law for light. Fermat derived the law by noting that light takes the easiest path, or the path of least resistance. In this sense, Fermat took note of the end or goal light rays achieve. By contrast, Descartes proved the same law solely by examining efficient causes, likening the refraction of light to bouncing tennis balls, and considering factors such as speed and mass. The refraction of light, Leibniz observes, can be explained and predicted under two separate causal paradigms.

Leibniz’s development of the calculus aids him greatly in his defense of final causes. Using what we would today call the variational calculus, Leibniz can show that change in nature happens at optimal points where the derivative vanishes. Systems thus tend towards certain end states and analyzing these states can furnish us with significant predictive power. Calculus permits Leibniz to tie discussions of final cause to mathematics, not poetics.
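The claim that end states are found "where the derivative vanishes" can be illustrated with the refraction case cited above. What follows is a minimal sketch in the modern least-time formulation, which is an assumption of this illustration and not a reconstruction of Leibniz's or Fermat's own calculation. The symbols are hypothetical labels introduced here: a ray travels from a point at height a above a flat interface to a point at depth b below it, the two points are a horizontal distance d apart, the ray crosses the interface at horizontal distance x from the first point, and light moves at speed v_1 above the interface and v_2 below it. The angles θ_1 and θ_2 are measured from the normal to the interface.

```latex
% Total travel time as a function of the crossing point x (assumed setup, see lead-in).
\begin{align*}
T(x) &= \frac{\sqrt{a^2 + x^2}}{v_1} + \frac{\sqrt{b^2 + (d - x)^2}}{v_2}, \\
% The optimal path is the one at which the derivative of T vanishes.
\frac{dT}{dx} &= \frac{x}{v_1\sqrt{a^2 + x^2}} - \frac{d - x}{v_2\sqrt{b^2 + (d - x)^2}} = 0, \\
% Since \sin\theta_1 = x/\sqrt{a^2 + x^2} and \sin\theta_2 = (d - x)/\sqrt{b^2 + (d - x)^2}:
\frac{\sin\theta_1}{v_1} &= \frac{\sin\theta_2}{v_2}.
\end{align*}
```

The last line is the law of refraction in its familiar form. The derivation appeals only to a condition on the path as a whole, with no mention of the forces acting on the light at the interface, which is the sense in which an explanation by end state can match the predictive power of an explanation by efficient causes.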

Although Leibniz finds both efficient and final causal explanations acceptable, he insists that they be kept separate. We ought not to invoke discussions of purpose simply when we lack a sufficient mechanical explanation. Final causes do not fill the gaps in our understanding of efficient causes; they provide another method of investigation entirely. Leibniz favors explanations by efficient causes, to be sure, as they open up great possibilities for engineering. Still, he considers either method a legitimate account of the world. Efficient causes, Leibniz likes to say, show us God’s power; final causes, by bringing to light the directedness and efficiency of nature, reveal God’s wisdom.

4. Theodicy

a. Leibniz’s Project

Leibniz ranks peace of mind as “the greatest cause of [his] philosophizing” (L 148). Central to Leibniz’s efforts to secure peace of mind is the thesis that we live in the best of all possible worlds, a position now commonly called Leibnizian optimism. Leibniz reasons that if we can assure ourselves that God acts in the best of all possible ways, then we can trust God’s justice and have true peace of mind. Of course, it is by no means self-evident that our world, which includes suffering and evil, is compatible with divine justice, nor is it self-evident what criteria could certify the world as “the best of all possible.” Leibniz thus devotes much argument to defending divine justice and coins the term “theodicy”—from the Greek words for God (theós) and justice (díkē)—to describe this project.

b. God

The thesis that God acts in the best of all possible ways follows from the notion of God as “an absolutely perfect being” (DM 1). Leibniz accepts Descartes’ ontological proof for the existence of God, which proves the existence of God by way of our idea of perfection, with one caveat. To Leibniz, Descartes leaves his proof open to the objection that God does not exist because God cannot exist. “An absolutely perfect being,” this objection posits, is a logical impossibility. So, Leibniz sets out to demonstrate that a single being can possess all perfections in a logically consistent manner. He bolsters the ontological proof by grounding the demonstration for God’s actuality in a demonstration of God’s possibility.

Leibniz clarifies what he means by “perfection” by stipulating that those properties incapable of a highest degree do not qualify as perfections. The “greatest of all numbers” is a contradiction, as is the “greatest of all figures,” since number and magnitude are infinitely continuous quantities. However, there is nothing inherently contradictory in “the highest degree of knowledge” or “the highest degree of power,” so omniscience and omnipotence are rightly considered divine perfections (DM 1). We can say a being possesses limitless knowledge and power without predicating meaningless, impossible attributes of God. Importantly for the purposes of an ontological proof, existence qualifies as a perfection under Leibniz’s definition.

Leibniz argues for the compatibility of all perfections by further stipulating that by “perfection” he means a simple, positive quality (L 167). Once we recognize that perfections are simple qualities, Leibniz believes we easily arrive at the conclusion that there is nothing inherently contradictory in the idea of a perfect being. For, were two perfections incompatible, this fact would be evident either immediately or through an analysis of the perfections in question. In the case of perfections like knowledge and power, no immediate incompatibility presents itself. Yet, because these qualities are simple, they cannot be broken down into components which might be shown incompatible. Since the incompatibility of perfections can be shown neither in itself, nor through demonstration, Leibniz concludes that God is a logically possible being. And—following the logic of the ontological proof—if possible, God is necessary.

Leibniz does not disallow other, a posteriori proofs for God’s existence. To the contrary, he employs several such proofs in his writings. Since it turns so much on the idea of perfection, however, his defense of the ontological proof holds a special place in his theodicy and thus in his philosophy as a whole.

c. Possible Worlds and Optimism

As an absolutely perfect being, God acts in the most perfect fashion. To understand what this means for an account of creation and a defense of God’s justice, Leibniz turns to the idea of possible worlds. A possible world is any set of possible substances whose attributes are mutually consistent, or compatible, with one another. Monads whose mutual existence would not entail contradictions are said to be compossible and thus potential members of a common world. God, in his omniscience, surveys an infinite number of compossible sets of substances and chooses to create the optimal, or best possible, world.

What characterizes the best possible world? By what criteria does God make his selection? In the Discourse on Metaphysics, Leibniz writes that God selects that world which most effectively balances simplicity of means with richness of effects (DM 5). He likens God to a skilled architect who best employs the space and resources available to him, or a skilled geometer who finds the most elegant solution to a problem. Simplicity of means requires that there be order, efficiency, continuity, and intelligibility in the world. Richness of effects requires the maximization of both metaphysical and moral goodness. Metaphysical goodness denotes the amount of essence or perfection in the world, in short, the extent to which various creatures in the world imitate God’s inexhaustible essence. Maximizing metaphysical goodness therefore requires, at the very least, the creation of a great variety of creatures. Moral goodness refers to the happiness of rational beings, particularly the perfection and advancement of their rational faculties.

Much scholarship is devoted to determining precisely how Leibniz sees richness and simplicity coinciding in the best possible world. The task of interpretation gains complexity from the fact that Leibniz also speaks of God optimizing beauty and harmony, and even at times suggests that the best possible world progresses continually in perfection over time. Despite the difficulties in interpretation, it is clear that at the very least rational beings must inhabit an intelligible world. The perfections of rational beings interfere with one another least and thus are maximally compossible. Rarely do the knowledge and virtue of one person prevent or disallow the knowledge and virtue of another. By contrast, the beauty of a mountain range does preclude the beauty of plains at a given space and time. Because rational beings are capable of knowing God and entering into relationship with him, they are most responsible for maximizing metaphysical and moral goodness in the world. The intelligible order of creation aids them in this by making knowledge of various phenomena accessible through simple hypotheses.

Crucially, the existence of suffering does not count as proof against our world as being the best possible. By Leibniz’s lights, the goodness of the world as a whole does not require that each aspect of the world be choiceworthy in itself. Pain and suffering find their place in the best possible world as “necessary evils” in maximizing its overall goodness. Here, the question of God’s justice arises and the true importance of possible worlds for Leibniz’s theodicy comes to light. How can God will to create pain and suffering? Does creating these not compromise divine justice? Leibniz responds that the divine will desires only what is good. The divine intellect takes, as it were, this desire for the good and determines how best to actualize it. The construction of the best possible world is the work of the divine intellect, and no more a matter of God’s will than the solution to an algebraic equation is a matter of my will. God, Leibniz asserts, antecedently wills the good and consequently wills the best. God never wills evils in themselves, and never compromises his perfection, goodness, or justice. He accepts evil and suffering only insofar as they contribute to the overall goodness of the best possible world.

The distinction between what follows from the divine will and what follows from the divine intellect ultimately provides Leibniz with a means of upholding God’s perfection, despite the imperfections of creation. Were the conditions of the optimal world determined not by the divine intellect, but rather by arbitrary fiat, God would be no more than a despot and we would have no objective standard by which to judge his actions best. Were pain and suffering objects of the divine will per se, God would be cruel and unworthy of love. In other words, Leibniz believes he safeguards divine perfection by explaining that God is neither injudicious in thought nor vicious in will in creating the world as it is. Thus, assuring ourselves of God’s goodness and perfection is vital because “one cannot love God without knowing his perfections” (T 54) and loving God provides more happiness and peace of mind than any other activity. “To love is to find pleasure in the happiness of another. We love God himself above all things because the pleasure which we experience in contemplating the most beautiful being of all is greater than any conceivable joy” (L 134).

Leibniz insists that his optimism provides grounds for true joy and peace of mind, not simply the kind of disaffected, “grin and bear it” acquiescence commonly associated with the Stoics and, as Leibniz sees it, championed by Spinoza and Descartes. God does not do what he must, but what is best. Whether or not Leibniz offers any greater consolation than the Stoics is an open question. Yet Leibniz believes that even if one cannot see the purpose of suffering, one can gain some measure of joy by contemplating, and advancing in knowledge of, God’s perfection.

Furthermore, because the theory of pre-established harmony among substances requires that all monads be created or destroyed collectively, Leibniz defends the immortality of monads. What we consider “life” is an active state of perception and appetite; what we consider “death” is simply dormancy. Leibniz, not unlike other Christian thinkers before him, maintains the hope that God will compensate for evils suffered by individuals over the full course of their existence, even if the purpose of those evils is not evident during their natural lifespans.

d. Freedom and Necessity

Leibniz’s theodicy raises two weighty sets of questions regarding freedom. The first concerns God’s freedom in creating. If the divine intellect objectively determines the design of the best possible world, should we not conclude that God is determined to create just this world? Is the notion of the divine will not meaningless, compromising the theological concept of grace? The second set of questions concerns human freedom. Since each individual substance contains all that can ever be predicated of it, and since God surveys the activity and interrelations of all monads in selecting the best possible world, it would seem that the entire course of history is set before the creation of the world. Does this mean that the idea of free will—and along with it theological concepts such as sin and redemption—is meaningless?

Leibniz takes these questions seriously throughout his career. His reflections trace at least to his Confession of a Philosopher of 1672-3. Section 13 of 1686’s Discourse on Metaphysics, which explores freedom and necessity, spurs his lengthy correspondence with Antoine Arnauld. And in the Theodicy of 1710, Leibniz calls the “labyrinth of freedom and necessity” one of the most perplexing questions facing humankind.

Though Leibniz is far from the first thinker to confront this “labyrinth,” his original contribution lies in his distinction between two kinds of necessity. Truths whose contraries imply a contradiction Leibniz calls “necessary per se.” Among these truths governed by the principle of non-contradiction, Leibniz includes the laws of arithmetic, geometry, and logic. Because these truths cannot be otherwise, not even to the divine intellect, Leibniz posits that they hold in all possible worlds. He thus refers to propositions necessary per se as “eternal verities.”

Truths which are certain, but whose contrary does not imply contradiction, Leibniz terms “necessary ex hypothesi.” The sequence of events in the world is necessary in this way. It is logically possible to conceive of the world being otherwise than it is. We create fictionalized accounts of reality in novels and dramas all the time; these accounts are entirely consistent in themselves. Because events in the world can be imagined otherwise, Leibniz believes they are in themselves contingent (contingent per se). Nevertheless, events in the world necessarily happen as they do on the presumption of (ex hypothesi) God’s selection of the best possible world. While the created world could be otherwise than it is, the optimal world could not be. Truths necessary ex hypothesi are governed by the principle of sufficient reason: God has a reason, a cause for creating the world in this way, namely, his desire for the best.

Leibniz locates a second method of distinguishing truths necessary per se from truths contingent per se in their respective manners of demonstration. The truth of a claim necessary per se, Leibniz writes, can be demonstrated a priori in a finite analysis, a proof with a finite number of steps. Think of Euclid’s demonstrations of the principles of geometry. Proving the truth of a contingent proposition, by contrast, requires an infinite analysis. To explain a priori why a given proposition about the world is true, one would have to take into account its harmony with all the other substances in the world, as well as account for why this set of substances was chosen out of the infinite number of possible worlds. Explanation would literally proceed ad infinitum. This is not to say that contingent truths are unknowable. God’s infinite intellect can presumably handle an infinite analysis and we know contingent truths a posteriori through experience. Infinitude of an analysis is a formal property of certain demonstrations, one Leibniz thinks suffices to distinguish necessary ex hypothesi from necessary per se truths.

With the distinction between the two kinds of necessity, Leibniz attempts to maintain meaningful notions of both divine and human freedom. Since God has infinitely many options among possible worlds, he cannot be said to be compelled in creating. One might object that God’s benevolent nature constrains and determines his action by forcing God to select the best world his intellect can design. Leibniz, however, counters that acting in accord with one’s nature and for the sake of the best is true freedom. One is only determined when constrained by outside forces. That God’s own nature leads him to create the best from among possible worlds makes him all the more free and worthy of praise.

Whether Leibniz is licensed to speak of human freedom is a thornier issue. Kant, in his Critique of Practical Reason, famously scoffs that Leibniz grants human beings nothing more than “the freedom of a turnspit” which, “once it is wound up, also accomplishes its movements of itself” (I.3; 5:97). Kant reasons that Leibniz’s monads, like any good machine, simply execute what they are programmed to do. To an extent, Kant is right. Leibniz does not entertain a notion of “free will,” if by this one means arbitrary and completely undetermined choice. The principle of sufficient reason banishes arbitrary choice. Human beings act in accord with their own natures, choosing what they deem best. My individual essence provides the reason for what I do.

Yet while rejecting a voluntarist conception of free will, Leibniz nevertheless speaks of human freedom. We might reconstruct Leibniz’s reasoning in three steps. First, with the modal distinction between the two kinds of necessity, Leibniz insists that human choices are not necessary in the strong sense. Each truth about monads and their history is logically contingent. Leibniz, therefore, is not a logical determinist. He is, however, an ontological determinist, insofar as all events are necessary given the composition of the world. Nevertheless (and this is the second step), the fact that each substance is causally independent of all other created substances makes each monad spontaneous. Spontaneity, to reiterate, refers to the fact that each state of a created substance follows from its preceding state without the direct influence of other substances; in this sense, each substance is “free.” Still, spontaneity is not what most people mean by human freedom. Human freedom (step three) comes with the fact that rational beings can gain knowledge of the causal principles governing the sequence of events in the world. Acting with knowledge does not make one less determined, but does make one less passive. One feels less at the mercy of inalterable forces when one understands these forces and can appreciate the principles of God’s design. The idea that increased activity and knowledge make an individual free owes much more to the conception of freedom developed by the Stoics and revived in the 17th century by Spinoza than it owes to voluntarist and Protestant conceptions of free will. As Leibniz sees it, his is the only conception of freedom compatible with divine perfection and worldly optimism.

5. Epistemology

a. Ideas and Knowledge

Leibniz’s epistemology begins with the distinction between clear and obscure ideas. An idea is clear when it allows one to recognize the thing represented, obscure when it does not. For example, one may have seen a gerbil and thus have an idea of what a gerbil is. However, if the next time one encounters a small rodent one cannot tell whether it is a gerbil or a hamster, then one possesses only an obscure idea of “gerbil.” By contrast, when one’s idea suffices to reliably distinguish one kind of object from others, then the idea is clear.

Leibniz divides clear ideas into two classes: confused and distinct. A clear idea is also distinct when one can catalogue all the marks, or criteria, distinguishing that idea from others. The animal physiologist can differentiate and enumerate those characteristics common to all rodents and those unique to gerbils. A child with a pet gerbil might not be able to do so and thus would have a clear but confused idea. Leibniz believes our sensory ideas, such as those of color, are clear and confused. Though we reliably distinguish blue from red, we cannot necessarily spell out the marks or causes which make one object blue and another red. We perceive colors without explaining them.

Leibniz proceeds to further classify clear and distinct ideas as either adequate or inadequate. If possessing an adequate idea, one has clear and distinct knowledge not only of the idea in question, but also of all its component parts. One has clear and distinct knowledge “all the way down” to the primitive concepts which compose the idea. Leibniz admits that he is unsure if any human being possesses an adequate idea, but believes our arithmetical knowledge most nearly approaches adequacy. In all other cases, where one cannot carry out comprehensive analyses down to primitive concepts, one has clear, distinct, yet inadequate ideas.

At its highest reaches, knowledge is not only adequate, but also intuitive. Intuitive knowledge is both adequate and non-discursive. That is, one clearly and distinctly knows all the ingredients of an idea and grasps these simultaneously. As is the case with all adequate knowledge, intuitive knowledge seems more suited to divine knowers than to human knowers, as the latter cannot think about all the components of a complex concept at once.

One consequence of Leibniz’s taxonomy of knowledge is that it provides Leibniz with a means of explaining sense perception. Given Leibniz’s idealism, all that exists in the world are monads and their mental states. Bodies are phenomenal and therefore not sources of knowledge. What, then, is sense perception? Is there any real difference between sensation and intellection if all ideas follow spontaneously from a monad’s own concept, with no interaction between monads? Leibniz answers such questions by noting that what we commonly experience as sense perceptions are simply confused ideas. Even if they are clear, sense perceptions are necessarily confused. Though these perceptions arise spontaneously in the perceiving subject, they express the harmony between a given monad and all others; it is therefore impossible to enumerate all the contributing factors to any given sense perception, most of which fall below the threshold of consciousness (DM 33). With the category of clear and confused ideas, Leibniz can meaningfully retain the distinction between sensation and intellection without compromising the basic tenets of his idealism.

Leibniz’s approach to ideas and knowledge separates him in some key respects from his fellow 17th-century rationalists. The division between distinctness and adequacy leads Leibniz to differentiate between nominal and real definitions. Nominal definitions include distinct knowledge; they sufficiently identify the defining marks of a concept. Yet they do not ensure that the concept is possible. It could be that a concept is internally inconsistent, a fact which would be revealed if one had adequate knowledge of all its parts. Real definitions account for the possibility of a thing, either by citing experience or through a priori demonstration. In his discussion of definition, Leibniz seeks to modify Hobbes’ strong nominalism in which all truth is dependent on the relationship between names and definitions. There is a higher level of knowledge than that contained in nominal definitions, one which accounts for possible existence in reality.

Hobbes is not Leibniz’s only rationalist target. Leibniz believes he improves upon Descartes’ maxim that all clearly and distinctly perceived ideas are true by delineating better criteria for clarity and distinctness. To Leibniz, Descartes construes clarity and distinctness as something like immediately perceived qualities, ripe for misevaluation.

b. Innate Ideas

In the New Essays on Human Understanding, Leibniz takes aim at Locke’s depiction of the mind as a tabula rasa, or blank tablet, needing external impressions to furnish it with the contents of its reasoning. In opposition to this conception of the mind and cognition, Leibniz affirms the existence of innate ideas. In one sense, Leibniz’s theory of substance obviously commits him to some conception of innate ideas. If monads have no “windows” through which they interact with other substances, then of course all their ideas must have an internal, innate origin.

But Leibniz does not rest his defense of innate ideas on his theory of substance. Rather, he advances fairly traditional epistemological arguments regarding the nature of deductive, a priori truths. Empirical knowledge can show that something is the case but cannot show that something is necessarily the case. The human mind, however, has knowledge of necessary truths, such as the laws of arithmetic and geometry. These necessary truths, which Leibniz calls “truths of reason,” are ideas whose opposite is impossible. They are the eternal truths which obtain in all possible worlds. Because truths of reason are known solely through the principle of non-contradiction and require no empirical data, Leibniz concludes that they are innate to the mind. Leibniz contrasts innate ideas with “truths of fact,” contingent truths whose opposite is possible and knowledge of which requires experience.

The theory of innate ideas does not imply that all minds have equal awareness of the truths of reason. Ideas are innate in us not as actualities, but “as inclinations, dispositions, tendencies, or natural potentialities” (NE 52). Accessing truths of reason requires effort. Yet the presence of innate ideas does incline us towards their discovery. In one particularly apt metaphor, Leibniz claims that rational minds are not like blank tablets, but like veined pieces of marble, disposed to be cut and polished in determinate ways.

c. Petites Perceptions

One of the more original elements of Leibniz’s epistemology is his theory of petites perceptions.

There are hundreds of indications leading us to conclude that at every moment there is in us an infinity of perceptions, unaccompanied by awareness or reflection; that is, of alterations in the soul itself, of which we are unaware because these impressions are either too minute and too numerous, or else too unvarying, so that they are not sufficiently distinctive on their own. But when they are combined with others they do nevertheless have their effect and make themselves felt, at least confusedly, within the whole. (NE 53)

Leibniz posits that at any given time, the mind has not only the thoughts of which it is aware, but also innumerable small, insensible perceptions, which he calls petites perceptions.

Leibniz wagers that there are “hundreds of indications” pointing to the existence of petites perceptions. Regardless of whether this is hyperbole, there are at least a few good reasons Leibniz includes these perceptions in his theory. For one, petites perceptions follow from the theory of pre-established harmony, both the harmony between all substances and the harmony between mind and body. Each monad mirrors the activity of all others at all moments. This mirroring takes place via mutual representation. Since no mind, at any given moment, has conscious awareness of all other substances, mutual representation must be taking place at insensible levels via petites perceptions. Moreover, the pre-established harmony between mind and body requires that mental activity express and run parallel to bodily activity. However, one is often insensitive to one’s bodily processes. In order to maintain the perfect parallelism between body and mind, therefore, we must conclude that the mind has petites perceptions of the body’s activity.

Even more fundamentally, the existence of petites perceptions follows from Leibniz’s understanding of substance. It is of a piece with the thesis that “there is nothing in the world but simple substances and in them perception and appetite.” Activity, more specifically perception, is the mark of any substance. That the mind has petites perceptions explains how it remains active and substantial even in dreamless sleep or after death.

Petites perceptions also help to explain the workings of appetite. Appetite determines the transition from one perception to the next, a transition which oftentimes seems sudden and episodic. For instance, one might jump immediately from thinking of one’s mother to thinking of Beethoven’s fifth symphony. On its face, this transition violates the principle of continuity, which states that no discontinuous change occurs. Nature—including rational nature—makes no leaps, has no gaps. The theory of petites perceptions accounts for apparent leaps in perception. What appears a discontinuous change in thought is actually determined by the continuous workings and interactions of infinitely many insensible perceptions.

Finally, petites perceptions help to explain what is confused in a confused idea, particularly in sense perceptions. The difficulty in explaining all the marks of a sensation comes from the many petites perceptions which contribute to it. “These minute perceptions…constitute that je ne sais quoi, those flavors, those images of sensible qualities, vivid in the aggregate but confused as to the parts; those impressions which are made on us by the bodies around us and which involve the infinite; that connection each of us has with the rest of the universe” (NE 54-5).

d. Reflection, Memory, Selfhood

All substances are incorporeal and perceptive. For this reason, Leibniz understands all substances on analogy to human minds or souls. Leibniz reserves the proper use of the term “soul,” however, for higher order substances with particular cognitive capacities. Souls not only perceive, but also apperceive. That is, they not only perceive objects, but also think about and reflect on themselves. They have the added capacity to remember past perceptions. These abilities to reflect and remember provide souls with a sense of self, an understanding of the “I.” As a result, souls have moral identities. Moral identity goes beyond the substantial identity over time that all monads have; moral identity requires that one can remember his past actions, recognize himself as the selfsame individual over time, and therefore assume responsibility for his character.

Reflection and memory make souls not just moral beings, but intellectual beings as well. Leibniz observes that self-reflection serves as the starting point for all metaphysical and philosophical thinking. Each soul is, as it were, its own principal innate idea. Studying one’s own nature leads one to form and investigate fundamental metaphysical ideas. “In thinking of ourselves, we think of being, of substance, of the simple and the composite, of the immaterial, and of God himself, by conceiving that that which is limited in us is limitless in him. And these reflective acts furnish the principal objects of our reasonings” (M 30).

Because of their moral and intellectual capacities, Leibniz likens souls to “little divinities” (M 30). Leibniz expresses the near divinity of rationality rather poignantly in the Theodicy:

This portion of reason which we possess is a gift of God and consists in the natural light that has remained with us in the midst of corruption; thus it is in accordance with the whole, and it differs from that which is in God only as a drop of water differs from the ocean, or rather as the finite from the infinite. (T 169)

Though every substance reflects God and his plan for the cosmos, rational souls are mirrors of God in a heightened way, being able to understand the nature of things, reflect on God’s works, and ultimately enter into relationship with him (M 83-84).

6. Ethics

Of the traditional major content areas of philosophy, ethics is perhaps the only one to which Leibniz is generally not considered to have made a significant contribution. Certainly he does not share the reputation as an ethicist enjoyed by the early modern thinkers Spinoza, Hume, and Kant, nor does he share the influence in political philosophy had by Locke and Hobbes. Leibniz himself, however, took great interest in the ethical dimensions of his thought. He engaged in central debates of the day regarding the foundations of justice and the possibility of altruistic love. Furthermore, all his thinking has a clear ethical bent, with the peace of mind sought by his optimism a prime example of this. While Leibniz’s ethical contributions do not match his metaphysics in scope or originality, when it comes to a thinker as singular as Leibniz, this fact alone should not discourage inquiry into his ethics.

a. Intellect and Will

Leibniz’s approach to ethics is, broadly speaking, intellectualist in nature. That is, Leibniz sees moral goodness as increasing in line with knowledge. He defines will as “the inclination to do something in proportion to the good it contains” (T 139). Hence, the more knowledge one has of the goodness of a particular object or act, the better one’s will is directed. Loving and desiring the right kinds of things follows from proper understanding. Perfecting the intellect, in short, accomplishes the perfection of the will.

Perfecting the intellect also brings about happiness. “It is obvious,” Leibniz writes, “that the happiness of mankind consists in two things—to have the power, as far as permitted, to do what it wills and to know what, from the nature of things, ought to be willed. Of these, mankind has almost achieved the former; as to the latter, it has failed in that it is particularly impotent with respect to itself” (L130). Despite Leibniz’s dour diagnosis of humanity’s understanding of perfection, his prognosis is encouraging. He does not see happiness as particularly difficult to achieve. One need only pursue and acquire knowledge of the nature of things.

The close alliance Leibniz sees between intellect and will has the further consequence of ruling out indifference of equipoise, a topic of much debate in the 17th century. At issue in discussions of this “indifference” is the question of whether one’s will can be in complete suspension when faced with two or more options, without inclination one way or another. The purported phenomenon of indifference of equipoise was taken at the time as evidence of the will’s independence from the intellect and even of its capacity for free, uncaused choice.

Leibniz rejects indifference of equipoise on grounds of the principle of sufficient reason. Uncaused events are incomprehensible; all events, including acts of the will, have some explanation. Here the deeper significance of Leibniz’s account of the will comes to light: one’s knowledge of the goodness of things provides the reason the will chooses as it does. Still, one might ask, could not the will be in equilibrium when faced with two objects of equal goodness? No. Per the principle of the identity of indiscernibles, each substance in the world has a unique complete concept which mirrors God and creation in a unique way; no two substances, no two states of affairs, are equivalent in goodness. One’s intellect and will therefore cannot respond identically to two different options. Though we may sometimes feel completely indifferent and unable to articulate the reasons for a choice, Leibniz insists that it would be a mistake to think of the choice as uncaused or of the will as uninclined. Infinitely many petites perceptions are at work in one’s mind at all times; much like machines, our movements are the result of all the tendencies and inclinations within us, even those of which we are unaware. Thus, we should not champion arbitrary choice by citing indifference of equipoise, but rather become freer, more self-aware moral beings through progress in knowledge.

b. Justice and Charity

Leibniz sees the study of justice as an a priori science of the good. There is, that is, an objective, rational basis for justice. Though Leibniz wrote much regarding the positive laws of states, he does not see positive law as the foundation of justice. He rejects the position that justice has no firmer foundation than the fiat of those in power, a position Leibniz often mentions in conjunction with Thrasymachus from Plato’s Republic but more pointedly associates with Samuel von Pufendorf and Thomas Hobbes. Taken to its logical conclusion, this position results in divine command theory: certain principles are just simply because God, the most powerful of all legislators, has posited they be so. For Leibniz, this line of thinking violates God’s perfection. God acts in the most perfect way and thus acts with good reason, not by arbitrary fiat. He is perfect not only in power, but also in wisdom. God’s perfect will follows upon his perfect intellect no less than the will of any rational being follows upon her intellect. The a priori, eternal standard of justice to which God himself adheres provides the basis for a theory of natural law.

Leibniz defines justice as the charity of the wise person. Though this may seem unique, or even odd, to those accustomed to seeing justice and charity contrasted, what is truly original in Leibniz’s rooting justice in charity is his very definition of charity, or love. In the 17th century, there was a series of debates regarding the possibility of disinterested love. Each creature, it would seem, acts to preserve and advance its own being. Hobbes and Spinoza employed the term conatus to refer to the striving each being has to persist in its own being and made it the foundation of their respective psychologies. On this view, one loves what one finds pleasing, that is, what one finds conducive to one’s own persistence. Love is reduced to a kind of egoism which, even where benevolent, nevertheless lacks an altruistic component.

Leibniz attempts to obviate the tension between egoism and altruism by defining love as taking pleasure in the happiness, or perfection, of another. With this definition, Leibniz does not deny the fundamental drive all creatures have for pleasure and self-interest, but ties it to altruistic concern for the well-being of others. The coincidence of altruism and self-interest defines love and captures the essence of justice. Justice is the charity of the wise person and the wise person, Leibniz goes on to say, loves all. Leibniz’s basic contention is that to be just is to show the love attended by insight that God shows. Ethics involves seeking the good of all in a prudent way, such that the good of each individual is pursued only insofar as it is compatible with the whole. One cannot love all when obtaining the happiness of one person at the expense of another’s, nor would this be desirable, since Leibniz believes we find more pleasure in harmony than discord. The kind of universal love demanded by Leibniz’s definition of justice is nurtured by reflection on the universal harmony between all things. Leibniz believes that appreciating the harmonious order of the cosmos can lead individuals to find pleasure in increasing the perfection and happiness of all who share in that order.

Leibniz’s definition of love also entails that loving God is the highest end of rational beings. If love is finding pleasure in the perfection of another, then loving an infinitely perfect being affords the greatest possible pleasure and happiness.

To love is to find pleasure in the happiness of another. We love God himself above all things because the pleasure which we experience in contemplating the most beautiful being of all is greater than any conceivable joy. (L 134)

Since the harmony of the world mirrors God’s perfection, Leibniz’s conception of justice does not place love of God at odds with love of others. We should take pleasure in perfection wherever we discern it. Justice as the charity of the wise person means that love of God and love of neighbor are one. By identifying justice with love of God and harmony between all, Leibniz brings to fruition the ethical implications of his metaphysical inquiries into God’s perfection and pre-established harmony. Ethics and metaphysics are, for Leibniz, never far apart.

7. References and Further Reading

a. Primary Sources: Leibniz Texts and Translations

The standard critical edition of Leibniz’s writings is G.W. Leibniz: Sämtliche Schriften und Briefe, edited by the Deutsche Akademie der Wissenschaften (Berlin: Akademie Verlag, 1923- ). The Akademie edition is still in production. Other useful editions of Leibniz’s writings in their original languages are those of C. I. Gerhardt (Die Philosophischen Schriften von Leibniz. 7 vols. 1875-1890) and Ludovici Dutens (Gottfried Wilhelm Leibniz: Opera Omnia. Hildesheim: Georg Olms Verlag, 1989).

References in this article to Leibniz’s works use the following abbreviations and translations:

AG G.W. Leibniz: Philosophical Essays. Edited and translated by Roger Ariew and Daniel Garber. Indianapolis: Hackett, 1989.

DM Discourse on Metaphysics, as translated by Ariew and Garber in G.W. Leibniz: Philosophical Essays. Passages from the Discourse are cited by section number.

G Die Philosophischen Schriften von Leibniz. Edited by C.I. Gerhardt. Berlin. 7 vols. 1875-1890.

L G.W. Leibniz: Philosophical Papers and Letters. Edited and translated by Leroy E. Loemker. 2nd ed. Dordrecht: Kluwer, 1989.

LA The Leibniz-Arnauld Correspondence. Edited by H.T. Mason. Manchester: Manchester University Press, 1967.

M Monadology, as translated by Ariew and Garber in G.W. Leibniz: Philosophical Essays. Passages from the Monadology are cited by section number.

NE New Essays on Human Understanding. Edited by Peter Remnant and Jonathan Bennett. Cambridge: Cambridge University Press, 1996.

T Theodicy: Essays on the Goodness of God, the Freedom of Man, and the Problem of Evil. Translated by E.M. Huggard. BiblioBazaar, 2007.

Other helpful collections of Leibniz’s writings in English include:

The Leibniz-Clarke Correspondence. Edited by H.G. Alexander. New York: Philosophical Library, 1956.
The Labyrinth of the Continuum: Writings on the Continuum Problem, 1672-1686. Edited by Richard T. W. Arthur. New Haven: Yale University Press, 2002.
The Leibniz-Des Bosses Correspondence. Edited by Brandon C. Look and Donald Rutherford. New Haven: Yale University Press, 2007.
Leibniz: Logical Papers. Edited by G.H.R. Parkinson. Oxford: Clarendon Press, 1966.
De Summa Rerum: Metaphysical Papers, 1675-1676. Edited by G.H.R. Parkinson. New Haven: Yale University Press, 1992.
Leibniz: Political Writings. Edited by Patrick Riley. Cambridge: Cambridge University Press, 1988.
Confessio Philosophi: Papers Concerning the Problem of Evil, 1671-1678. Edited by Robert C. Sleigh, Jr. New Haven: Yale University Press, 2005.
Leibniz and the Two Sophies: The Philosophical Correspondence. Edited by Lloyd Strickland. Toronto: Iter, Inc., 2011.
Leibniz’s ‘New System’ and Associated Contemporary Texts. Edited by R.S. Woolhouse and Richard Francks. Oxford: Clarendon Press, 1997.

b. Secondary Sources

i. Introductory Texts

Antognazza, Maria Rosa. Leibniz: An Intellectual Biography. New York: Cambridge University Press, 2009.
Arthur, Richard T.W. Leibniz. Cambridge: Polity Press, 2014.
Jolley, Nicholas. Leibniz. New York: Routledge, 2005.
Perkins, Franklin. Leibniz: A Guide for the Perplexed. New York: Continuum, 2007.
Savile, Anthony. Routledge Guidebook to Leibniz and the Monadology. New York: Routledge, 2000.

ii. More Advanced Studies

Adams, Robert Merrihew. Leibniz: Determinist, Theist, Idealist. New York: Oxford University Press, 1994.
Garber, Daniel. Leibniz: Body, Substance, Monad. New York: Oxford University Press, 2009.
Ishiguro, Hidé. Leibniz’s Philosophy of Logic and Language. Ithaca: Cornell University Press, 1975.
Mercer, Christia. Leibniz’s Metaphysics: Its Origins and Development. New York: Cambridge University Press, 2001.
Parkinson, G.H.R. Logic and Reality in Leibniz’s Metaphysics. Oxford: Clarendon Press, 1965.
Rescher, Nicholas. Leibniz’s Metaphysics of Nature. Dordrecht: Reidel, 1981.
Riley, Patrick. Leibniz’s Universal Jurisprudence: Justice as the Charity of the Wise. Cambridge, MA: Harvard University Press, 1996.
Rutherford, Donald. Leibniz and the Rational Order of Nature. New York: Cambridge University Press, 1995.
Sleigh, Robert C. Leibniz and Arnauld: A Commentary on their Correspondence. New Haven: Yale University Press, 1990.
Smith, Justin E.H. Divine Machines: Leibniz and the Sciences of Life. Princeton: Princeton University Press, 2011.
Strickland, Lloyd. Leibniz Reinterpreted. London: Continuum, 2006.
Wilson, Catherine. Leibniz’s Metaphysics: A Historical and Comparative Study. Princeton: Princeton University Press, 1989.

iii. Collected Essays

Brown, Stuart, ed. The Young Leibniz and his Philosophy (1646-76). Dordrecht: Kluwer, 1999.
Jorgensen and Newlands, eds. New Essays on Leibniz’s Theodicy. Oxford: Oxford University Press, 2014.
Jolley, Nicholas, ed. The Cambridge Companion to Leibniz. Cambridge: Cambridge University Press, 1995.
Rutherford and Cover, eds. Leibniz: Nature and Freedom. New York: Oxford University Press, 2005.

Author Information

Edward W. Glowienka
Email: eglowienka@carroll.edu
Carroll College
U. S. A.

Resource Bounded Agents

Resource bounded agents are persons who have information processing limitations. All persons and other cognitive agents who have bodies are such that their sensory transducers (such as their eyes and ears) have limited resolution and discriminatory ability; their information processing speed and power are bounded by some threshold; and their memory and recall are imperfect in some way. While these general facts are not controversial, it is controversial whether and to what degree these facts should shape philosophical theorizing.

Arguably, resource bounded agents pose the most serious philosophical challenges to normative theories in a number of domains, and especially to theories of rationality and moral action. If a normative theory endorses a standard for how an agent ought to act or think, or if a normative theory aims to provide recommendations for various kinds of conduct, such a theory will have commitments regarding the descriptive facts about the agent’s cognitive limitations. There are two major responses. These theories may either (1) argue to dismiss these descriptive facts as irrelevant to the normative enterprise (see section 2) or, instead, (2) attempt to accommodate these facts in some way (see section 3). Historically, normative theories that have attempted to accommodate facts about cognitive limitations have done so by either (i) augmenting the proposed normative standard, or (ii) using facts about cognitive limitations to show that agents cannot meet the proposed normative standard.

After a brief discussion of some empirical work addressing human cognitive limitations, this article will discuss idealization in philosophy and the status of the normative bridge principle “ought implies can,” which suggests that “oughts” are constrained by descriptive limitations of the agent. Next, the article explores several theories of rationality that have attempted to accommodate facts about cognitive limitations.

As an introductory and motivating example, consider the claim that human agents ought not to believe inconsistent propositions. Initially, such a claim seems perfectly reasonable. Perhaps this is because a collection of inconsistent propositions is guaranteed to include at least one false proposition. But Christopher Cherniak (1986) has pointed out that when one has as few as 140 (logically independent) beliefs, there are approximately 1.4 tredecillion (a number with 43 digits) truth-value combinations to check for potential inconsistency. No human could ever check that many items for consistency. In fact, an ultra-fast supercomputer would take 20 billion years to complete such a task. Hence, for some epistemologists, the empirical fact of the impossibility of a complete consistency-check of a human’s belief corpus has provided reason for thinking that complete consistency of belief is not an appropriate normative standard. Whether such a response is ultimately correct, however, concerns the status of resource bounded agents in normative theorizing.
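The size of the figure can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the figure counts the 2^140 possible truth-value assignments over 140 logically independent beliefs, as in a brute-force truth-table check. The function name and the checking rate are hypothetical illustrations and are not taken from Cherniak.

```python
# Assumption: each of n logically independent beliefs may be true or false,
# so an exhaustive truth-table consistency check has 2**n rows to examine.
N_BELIEFS = 140
assignments = 2 ** N_BELIEFS

# A tredecillion is 10**42 on the short scale.
TREDECILLION = 10 ** 42

print(f"{assignments:,}")                       # the full integer
print(len(str(assignments)), "digits")          # length of the number
print(f"{assignments / TREDECILLION:.2f} tredecillion")


def years_to_check(n_beliefs: int, checks_per_second: float) -> float:
    """Years needed to examine every assignment at a given (hypothetical) rate."""
    seconds_per_year = 60 * 60 * 24 * 365
    return (2 ** n_beliefs) / (checks_per_second * seconds_per_year)


# Hypothetical rate, chosen only for illustration: 10**18 checks per second.
print(f"{years_to_check(N_BELIEFS, 1e18):.3e} years")
```

Because the count doubles with every added belief, no realistic improvement in the checking rate changes the conclusion: the task stays far beyond any embodied agent.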

1. Cognitive Limitations and Resource Bounds
2. Idealization
3. Accommodating Cognitive Limitations
  a. Changing the Normative Standard
  b. Failing to Meet the Standard
    i. Kahneman and Tversky’s “Heuristics and Biases” Program
4. References and Further Reading
  a. References
  b. Further Reading

1. Cognitive Limitations and Resource Bounds

Every known cognitive agent has resource and cognitive limitations. Christopher Cherniak refers to this necessary condition as the “finitary predicament”: because agents are embodied, localized, and operate in physical environments, they necessarily face informational limitations. While philosophers have acknowledged this general fact, the precise details of these resource and cognitive limitations are not widely discussed, and the precise details could matter to normative theorizing. Revisiting the example from above, it is obvious that humans cannot check 1.4 tredecillion truth-value combinations for consistency. But it is not obvious how many beliefs a human agent can check. If it could be experimentally demonstrated that humans could not occurrently check twelve beliefs for consistency, even this minimal consistency check might not be rationally required. Hence, the precise details of cognitive limitations need to be addressed.

Before turning to the details of cognitive limitations, it is important to note that there are two senses of the term ‘limitation’. To see the distinction, consider a simple example. Very young children are limited in their running abilities. This limitation can be described in two ways: (i) young children cannot run a mile in under four minutes, and (ii) young children are not excellent runners. The important difference in these (true) descriptions is that way (i) uses non-normative language and way (ii) uses normative language. This distinction is crucial when the main objective is an evaluation of the normative standard itself. For instance, challenging whether (i) is true involves non-normative considerations while challenging whether (ii) is true fundamentally involves normative considerations. As such, the kinds of cognitive limitation under discussion in this article will primarily concern non-normative limitations.

In what follows, this article will survey some findings from cognitive psychology to illustrate various attempts to measure human cognitive limitations. These findings are not exhaustive and should be thought of as representative examples.

a. Limitations of Memory

Memory is the general process of retaining, accessing, and using stored information. Short-term memory is the process of storing small amounts of information for short periods of time. In 1956 George Miller published a paper that helped measure the limitations of human short-term memory. This paper was an early example of the field that would later be known as cognitive psychology. In “The Magical Number Seven, Plus or Minus Two”, Miller argued that short-term memory is limited to approximately seven items (plus or minus two). That is, Miller argued that for typical adult humans, short-term memory is bounded by about nine items. Later work such as Cowan (2001) has suggested that the capacity of short-term memory might be smaller than previously thought, perhaps as small as four items.

In some ways, Miller’s result should be puzzling. Humans are often able to recite long sentences immediately after reading them, so how would this ability square with Miller’s experimental results? Miller also introduced the idea of “chunking” in his famous 1956 paper. To “chunk” items is to group them together as a unit (often by a measure of similarity or meaningfulness). This is an information compression strategy. For example, suppose the task is to remember the following eight words: catching, dog, apples, city, red, frisbees, park, yellow. Likely, this would be somewhat difficult. Instead, suppose the task was to remember the four phrases: yellow dog, red apples, catching frisbees, city park. This should be less difficult, even though the task still involves eight words. The explanation is that the eight items have been “chunked” down to four informational items (to be “uncompressed” later when needed). Yet, the existence of chunking strategies does not mean that short-term memory is unbounded. Typical humans cannot remember more than seven (plus or minus two) chunks, nor is it the case that just any string of information can be chunked. For many subjects, it would be exceedingly difficult to chunk the following eight strings of letters: rucw, mxzq, exef, cfiw, uhss, xohj, mnwf, ofhn.

Long-term memory is the process of storing information for long periods of time. Long-term memory also features kinds of limitation. It may be tempting to think that stored memories are like photographs or video, which may be retrieved and then reviewed as an unaltered representation of an event. But this is not how human memory works. Psychologists have known for a long time that many aspects of memory are “constructive”. That is, factors such as expectation, experience, and background knowledge can alter memories. Humans are prone to omit details of events and even add details that never occurred. Consider the classic example of Bartlett’s “War of the Ghosts” experiment. In 1932 Frederic Bartlett read British subjects a story from aboriginal Canadian folklore. He then asked the subjects to recall the story as accurately as they were able. This established a baseline of subject performance. Next, Bartlett used the experimental technique of “repeated reproduction” and had subjects retell the story after longer and longer periods of time. Bartlett found that as more time passed, subjects’ retellings of the story became shorter and more and more details were omitted. As well, many subjects added details to the story that reflected their own culture, rather than the cultural setting of the story. As one example, instead of recalling the canoes that were mentioned in the story, many subjects retold the story as concerning boats, which would be more familiar to a British participant.

It has also been demonstrated that for some kinds of information, retrieving an item from memory can reduce the likelihood of successfully retrieving a competing or related item. As a simple example, trying to remember where one last put one’s keys would be much more difficult if competing memories such as where one put the keys two days ago or three days ago were just as likely to be recalled. Instead, it appears as though there is an inhibitory mechanism that suppresses the recall of competing memories (in this case, the older “key location” memory). While potentially beneficial in some respects, this “retrieval-induced forgetting” effect might be harmful in some academic settings. Macrae and MacLeod (1999) gave subjects 20 “facts” about a fictional island. Next, subjects were evenly divided into two groups: group one practiced memorizing only a select 10 of the 20 facts and group two did not practice memorizing any of the 20 facts. Unsurprisingly, group one had better recall than group two on the select 10 facts. But, interestingly, group two had better recall than group one on the other 10 facts. That is, by attempting to memorize some subset of the 20 facts, group one had impoverished recall in the unpracticed subset of facts. This result might have implications for students that attempt to cram for an exam: in cramming for an exam, students may reduce their performance on unstudied material.

In addition to the above limitations, humans also suffer from age-related decreases in memory performance. Humans also typically have difficulty remembering the source of their information (that is, how they initially learned the information). Further, misinformation and suggestion can alter subjects’ memories and even create “false memories”. Eyewitness reports of a crime scene may omit relevant information when a gun is present (known as “weapon focus”), due to the narrow attentional focus on the gun. As well, subtle feedback to an eyewitness report (for example, a police officer says “thanks for helping identify the perpetrator”) can strengthen the eyewitness’ feeling of confidence, but not their reliability.

b. Limitations of Visual Perception

Humans are able to visually detect wavelengths between roughly 400 and 700 nanometers, corresponding to colors from violet to red. Hence, unaided human vision cannot detect much of the information in the electromagnetic spectrum, including infrared and ultraviolet radiation. Under ideal conditions, humans can discriminate between wavelengths in the visible spectrum that differ by only a few nanometers.

It is a mistake to think that, for humans, the entire visual field is uniformly detailed. This is surprising, because it seems (phenomenologically, at least) that most of the visual field is detail rich. Recall the experience of studying the brushstrokes of an artwork at approximately five feet of distance. The uncritical experience suggests that vision always provides highly detailed information—perhaps this is because everywhere one looks there appears to be detail. Yet, there is a sense in which this is an illusion. In the human eye, the fovea is responsible for providing highly detailed information, but the fovea is only a small part of the retina. Eye movements, called saccades, change the location of foveal vision to areas of interest, so details can be extracted where they are wanted. Much of the visual field in humans does not provide detail rich information, and might be described in lay terms as being similar to “peripheral vision”. This non-foveal part of the visual field has limited acuity and results in impoverished perceptual discriminatory ability.

Just as it is incorrect to think that memory works like a photograph, human color vision does not simply provide the color of an object in the way a “color picker” does in an image-editing computer program. The color an object appears is often highly sensitive to the amount of light in the environment. Color judgments in humans can be highly unreliable in low light environments, such as when distinguishing green from purple. Human vision is also subject to color constancy in some circumstances. Color constancy occurs when objects appear to stay the same color despite changes in the conditions of illumination (which change the wavelengths of light that are reflected) or in their proximity to other objects. For instance, the green leaves of a tree may appear to stay the same color as the sun is setting. Color constancy may be helpful for the tracking or re-identification of an object through changing conditions of illumination, but it may also increase the unreliability of color judgments.

c. Limitations of Attentional Resources

Attention is the capacity to focus on a specific object, stimulus, or location. Many occurrent cognitive processes require attentional resources. Lavie (1995, 2005) has proposed a model that helps explain the relationship between the difficulty of various tasks and the ability to successfully deploy attentional resources. Lavie’s idea is that total cognitive resources are finite, and difficult cognitive tasks take up more of these resources. A direct implication is that comparatively easier tasks allow for available cognitive resources to process “task-irrelevant” information. Processing task-irrelevant information can be distracting and even reduce task performance. For an example of this phenomenon, consider the difference between taking an important final exam and casually reading at a coffee shop. Applying Lavie’s model, taking an important final exam will often use all of one’s cognitive resources, and hence, no task-irrelevant information (such as the shuffling of papers in the room or the occasional cough) will be processed. In this particular instance, the task-irrelevant stimuli cannot be distracting. In contrast, casually reading at a coffee shop typically is not a “high-load” task and does not require most of a subject’s cognitive resources. While reading casually one can still overhear a neighboring conversation or the sound of the espresso machine, sometimes hindering the ability to concentrate on one’s book.

As an example of competition from task-irrelevant stimuli, consider the well-known Stroop effect. First conducted by J.R. Stroop in 1935, the task is to name as quickly as possible the color of ink used to print a series of words. For words such as ‘dog’, ‘chair’ and ‘house’, each printed in a different color, the task is relatively easy. But Stroop also had subjects name the ink color of words such as ‘green’, ‘blue’, and ‘red’ printed in non-matching colors (so ‘red’ might be printed in blue ink). This version of the task is much more challenging, often taking twice as much time as the version without color words. One explanation of this result is that the task-irrelevant information of the color word is difficult to ignore, perhaps because linguistic processing of words is often automatic.

Attentional resources are also deployed in tracking objects in the environment. Object-based attention concerns representing and tracking objects. Xu et al. (2009) report that due to limits on processing resources, the visual system is able to individuate and track about four objects. Sears and Pylyshyn (2000) also cite limits on the capacity to process visual information and have shown that subjects are able to track about five identical objects in a field of ten objects.

2. Idealization

This section will discuss one dismissive response to problems posed by resource bounded agents. The basic idea behind this response is that descriptive facts about cognitive limitations are irrelevant to the normative enterprise.

a. Idealization Strategies

In drafting various normative theories (concerning, for example, rational belief or moral action), some philosophers have claimed to be characterizing “ideal” agents, rather than “real” or “non-ideal” agents like humans (where real or non-ideal agents are those agents that have cognitive limitations). This strategy can be defended on a number of lines, but one defense appeals to theory construction in the physical sciences. In drafting physical theories it is often helpful to first begin with theoretically simple constraints and add in complicating factors later. For instance, many introductory models about forces omit mention of complicating factors such as friction, air resistance, and gravity. Likewise, a philosopher might claim that the proper initial subject of normative theorizing is the ideal agent. As such, descriptive details of the cognitive limitations of non-ideal agents are simply not relevant to initial theorizing about normative standards, because ideal agents do not have cognitive limitations. Yet, the thought is, theories of ideal agents might still be useful for evaluating non-ideal agents. Continuing with the analogy with scientific models, the proposed strategy would be to first determine the normative standard for ideal agents, and then evaluate non-ideal human agents as attempting to approximate this standard.

As one example of this strategy, return to the issue of believing inconsistent propositions. Because ideal agents do not have memory or computational limitations, these agents are able to check any number of beliefs for inconsistency. It then seems that these agents ought not to believe inconsistent propositions. Perhaps the reason for this is that one ought not to believe false propositions, and a set of inconsistent propositions is guaranteed to have at least one false member. This result might serve as one dimension of the normative standard. Now, turning attention to resource bounded agents such as humans, it might be thought that these agents ought to try to approximate this standard, however imperfectly. That is, the best reasoners imaginable will not believe inconsistent propositions, so humans ought to try to approximate the attitudes or behaviors of these reasoners. On this view, better human reasoners believe fewer inconsistent propositions.

A second defense of the idealization strategy appeals directly to the kinds of concepts addressed by normative theories. Many normative concepts appear to admit of degrees. It might be thought that there can be better and worse moral decisions and better and worse epistemic attitudes (given a collection of evidence). If this is correct then, plausibly, ideal agents might be thought to be the best kind of agent and correspondingly the proper subject for normative theorizing. Consider the following example. Suppose a person witnesses an unsupervised child fall off a pier into a lake. In a real case, the human observer might feel paralyzing stress or anxiety about the proper response and thus momentarily postpone helping the child. Such a response may seem less than optimal—it would be better if the agent responded immediately. Considering these optimal responses might necessarily involve imagining ideal agents, because (plausibly) every real agent will have some amount of stress or anxiety. Because ideal agents do not have psychological limitations, an ideal agent would not become paralyzed by stress or anxiety and would respond immediately to the crisis. In this regard, after abstracting away from complicating factors arising from human psychology, ideal agents might help reveal better moral responses.

As briefly mentioned above, idealization strategies often offer a bridge principle, linking the proposed normative standard to real human action and judgment. Of course, human agents are not ideal agents, so how do ideal normative standards apply to real human agents? One common answer is that human agents ought to try to approximate the ideal standards, and better agents more closely approximate these standards. For instance, it is clear that no human agent could achieve a pairwise check of all of their beliefs for logical consistency. But it still might be the case that better agents check more of their beliefs for consistency. Plausibly, young children check few of their beliefs for consistency whereas reflective adults are careful to survey more of the claims they endorse for consistency and coherence. On this measure it is not obviously unreasonable to judge the reflective adult as more rational than the young child.

b. Problems with the Idealization Strategy

One potential problem with the idealization strategy is the threat of incoherence. If every cognitive agent is physically embodied, then every cognitive agent will face some kinds of resource limitation. Hence, it is unclear that ideal agents are either physically possible or even conceivable. What kind of agents are ideal cognizers anyway? Do ideal cognizers even reason or make inferences, given the immediate availability of their information? Should we really think of them as reasoners or agents at all? Ideal cognizers are certainly unlike any cognitive agent with which we’ve ever had any experience. As such, the thought is that little weight should be placed on claims such as “ideal agents are able to check any number of beliefs for inconsistency”, because it is not clear such agents are understandable.

An idealization theorist might respond by leaning on the analogy with model construction in the physical sciences. Introductory models of forces that omit friction, say, may describe or represent physically impossible scenarios but these models nonetheless help reveal actual structural relationships between force, mass, and acceleration (for instance). Perhaps, so too for normative theorizing about ideal agents.

A second potential problem with the idealization strategy concerns possible disanalogies between theorizing in philosophy and the physical sciences. Introductory models of forces in the physical sciences do not yield ultimate conclusions. That is, the general relationship between force and mass that is established in idealized models is later refined and improved upon with the addition of realistic assumptions. These updated models are thought to be superior, at least with respect to accuracy. In contrast, however, many philosophers who claim to be theorizing about ideal agents take their results to be either final or ultimate. As previously mentioned, some epistemologists take belief consistency to be a normative ideal, and adding realistic assumptions to the model does not produce normatively better results. If such a stance is taken, then this weakens the analogy with theory construction in the physical sciences.

A third potential problem with the idealization strategy is that it is not clear that there are unique ideal agents or even unique idealized normative standards. Why should we think that there is one unique ideally rational agent or one unique ideally moral agent, rather than a continuum of better agents (perhaps just as there is no possible fastest ideal marathon runner)? The worry is clear in this respect: if there are only better and better agents (with no terminally best agent) then the study of any particular idealized agent cannot yield ultimate normative standards. It is also not clear that there are always unique idealized normative standards. For instance, it is often assumed that there are optimal decisions or optimal plans for ideal agents to choose. Yet, John Pollock (2006) has argued that there is “no way to define optimality so that it is reasonable to expect there to be optimal plans”. The consequence of this result, if it can be maintained, is that there is no unique optimal plan or set of plans that an ideal agent could choose. Hence, an idealization strategy, one that abstracts away from time and resource constraints on the agent, could not represent ideal plans. It is more controversial as to whether there are optimal belief states that ideal reasoners would converge to, given unbounded time and unbounded cognitive resources.

c. Ought Implies Can

A fourth potential problem with the idealization strategy concerns the well-known and controversial “ought implies can” principle. If true, this principle states that the abilities of the agent constrain normative demands or requirements on the agent. Consider an example from the moral domain. Suppose that, after an accident, a ten-ton truck has pinned Abe to the ground and is causing him great harm. Ought an onlooker, Beth, lift the truck and free Abe? Many would claim that because Beth is unable to lift the truck, she has no duty or obligation to lift the truck. In other words, it might seem reasonable to think that Beth must be able to lift the truck for it to be true that she ought to lift the truck. There may well be other things that Beth ought to do in this situation (perhaps make a phone call or comfort Abe), but the idea is that these are all things that Beth could possibly do.

If “ought implies can” principles are true in various normative domains such as ethics or epistemology, then the corresponding idealization strategy would face the following problem. Idealization strategies, by definition, abstract away from the actual abilities of agents (including facts about memory, reasoning, perception, and so forth). Hence, these strategies will not produce normative conclusions that are sensitive to the actual abilities of agents, as “ought implies can” principles require. Hence, idealization strategies are defective.

Said differently, “ought implies can” principles suggest that descriptive facts matter to normative theorizing. As Paul Thagard (1982) has said, epistemic principles “should not demand of a reasoner inferential performance which exceeds the general psychological abilities of human beings”. Of course, idealization strategies necessarily disagree with this claim. If “ought implies can” principles are true then we have reason to reject idealization strategies.

Are “ought implies can” principles true? Intuitively, the Abe and Beth case above seems plausible and reasonable. This provides prima facie evidence that there is something correct about a corresponding moral “ought implies can” principle in the moral domain. However, in epistemology, there are reasons to think that “epistemic oughts” do not always imply “epistemic cans”.

In defending evidentialism, Richard Feldman and Earl Conee (1985) have argued that cognitive limits do not always constrain theories of epistemic justification. As they say, “some standards are met only by going beyond normal human limits”. Feldman and Conee give three examples. The first concerns a human agent whose doxastic attitude a best fits her evidence e, but forming a is beyond the agent’s “normal cognitive limits”. To fill in the details, suppose that the doxastic attitude that best fits Belinda’s evidence is believing that her son is guilty of the crime, but also suppose that Belinda is psychologically unable to appropriately assess her evidence (given its disturbing content). Feldman and Conee think that the intuitive response to such a case would be to think that (believing in guilt) “would still be the attitude justified by the person’s evidence”, even though in this case Belinda faces the impossible task of assessing her evidence. Indeed, it seems that this is a standard response one might have toward family members of guilty defendants: given the evidence, they ought to believe that their loved one is guilty, despite its impossibility. If such a response is correct, then “ought implies can” principles are not always true in epistemic domains.

The second and third examples Feldman and Conee give are the following:

Standards that some teachers set for an “A” in a course are unattainable for most students. There are standards of artistic excellence that no one can meet, or at least standards that normal people cannot meet in any available circumstance.

These latter examples are surely weaker than the first. It would be completely unreasonable for a teacher to adopt a standard for an “A” that was impossible for any student to satisfy (“to get an ‘A’ a student must show that 0 = 1”). However, part of the difficulty here is that the relevant notion of “can” is either vague or ambiguous. Does “can” mean some students could satisfy the standard sometimes? Or does “can” mean that at least one student could satisfy the standard once? It would not be unreasonable for a teacher to adopt a standard for an “A” that one particular class of students could not attain. The art example is even more difficult. First, the art example is unlike the Abe pinned under the truck example. In that case it was physically impossible for Beth to lift the truck. The art example, however, contains a standard that “normal people cannot meet in any available circumstance”, with the implication that some humans can meet the standard. The difference between these examples is that one is indexed to Beth’s abilities and the other is indexed to human artistic abilities, generally. The worry is that some standards might be “community standards” and hence the relevant counterexample would be a case where no one in the community could meet the standard. Indeed, it would be an odd artistic standard such that no possible human could ever satisfy it.

Lastly, it is unclear whether Feldman and Conee’s remarks can be generalized to other normative domains. Even if Feldman and Conee are correct in thinking that various “epistemic oughts” do not imply “epistemic cans”, it is not obvious whether similar considerations hold in the domain of morality or rational action.

3. Accommodating Cognitive Limitations

The second major kind of response to resource bounded agents is to accommodate the descriptive facts of cognitive limitations into one’s normative theory. Proponents of this response claim that facts about cognitive limitations matter for normative theories. To continue with the example of believing inconsistent propositions, a theorist who adopted a version of this response might argue that resource bounded agents ought not to believe “feasibly reached” or, instead, “obvious” inconsistent propositions. This response would accommodate facts about cognitive limitations by relaxing the standard “never believe any set of inconsistent propositions”.

There are two ways in which one might attempt to accommodate cognitive limitations into one’s normative theorizing. First, similar to the above example, one might “change the normative standard” and argue that resource bounded agents show that normative standards should be relaxed in some way. Versions of this response will be discussed in section 3a. Second, one might instead argue that cognitive limitations show that the agents being investigated cannot meet the proposed normative standard, and hence, are inherently defective in some dimension. This response will be discussed in section 3b.

a. Changing the Normative Standard

In this subsection, the article discusses several prominent views that accommodate descriptive facts about cognitive limitations by augmenting or changing normative standards.

i. Simon’s “Satisficing View” of Decision Making

One way to accommodate the cognitive limitations that agents face is to relax the traditional normative standards. In the domain of rational decision making, Herbert Simon (1955, 1956) replaced the traditional “optimization” view of the rationality of action with the more relaxed “satisficing” view. To illustrate the difference between optimization procedures and satisficing procedures, consider the well-known “apartment finding problem”. Presumably, when searching for an apartment one values several attributes (perhaps cost, size, distance from work, quiet neighborhood, and so forth). How ought one choose? The optimization procedure recommends maximizing some measure. For example, one way to proceed would be to list every available apartment, assess each apartment’s total subjective value under the various attributes, determine the likelihood of obtaining each apartment, calculate the resulting “weighted average” for each, and then choose the apartment that maximizes this measure. Simon noticed that such an optimization procedure is typically not feasible for humans: it is too computationally demanding. First, complete information about apartment availability, or even about apartment attributes, is often unavailable. Second, the relevant probabilities are crucial to an optimization strategy, but estimating them is too cognitively demanding for typical human agents. For example, what is the probability that apartment B will still be available if the initial offer for apartment A gets rejected? How would one calculate this probability? Instead, Simon suggests that humans ought to make decisions by “satisficing”, or deciding to act when some threshold representing a “good enough”, but not necessarily best or optimal, outcome is achieved. To satisfice in the apartment finding problem, one determines some appropriate threshold or aspiration level of acceptability (representing “good enough”), and then one searches for an apartment until this threshold is reached. A satisficer picks the first apartment that surpasses this threshold.
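
The contrast between the two procedures can be illustrated with a minimal sketch. It rests on simplifying assumptions that are not Simon’s: each apartment is reduced to a single hypothetical score (the probabilities of obtaining each apartment are ignored), the apartments are viewed in a fixed order, and the names `optimize`, `satisfice`, and the sample data are invented for illustration.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple

Apartment = Tuple[str, float]  # (name, hypothetical overall score)


def optimize(options: Sequence[Apartment],
             value: Callable[[Apartment], float]) -> Apartment:
    """Assess every option and return the one with the highest value."""
    return max(options, key=value)


def satisfice(options: Iterable[Apartment],
              value: Callable[[Apartment], float],
              threshold: float) -> Optional[Apartment]:
    """Return the first option that meets the threshold, then stop searching."""
    for option in options:
        if value(option) >= threshold:
            return option
    return None  # nothing viewed so far was "good enough"


def score(apartment: Apartment) -> float:
    return apartment[1]


# Hypothetical apartments, in the order in which they are viewed.
apartments = [("A", 5.0), ("B", 7.5), ("C", 9.0), ("D", 8.0)]

best = optimize(apartments, score)                         # inspects all four
good_enough = satisfice(apartments, score, threshold=7.0)  # stops at the second
```

Under these assumptions the optimizer must score every apartment before choosing C, whereas the satisficer stops at B and never assesses C or D. The sketch leaves open how the threshold itself should be chosen.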

It is important to note that, under a common interpretation, Simon is not recommending the satisficing procedure as a next best alternative to the optimization procedure. Instead, Simon is suggesting that the satisficing procedure is the standard by which to judge rational action. Correspondingly, human agents who do not optimize in the sense described above are not normatively defective qua rational decision maker.

One claimed advantage of satisficing over optimization concerns computational costs. A satisficing strategy is thought to be less computationally intensive than an optimization strategy. Optimization strategies require the computation of “expected values” based on a network of probabilities and subjective values, and also the computational resources to store and compare these values. Satisficing strategies, by contrast, only require that an agent is able to compare a possible choice with a threshold value, and there is no need to store past assessments (other than the fact that a past choice was assessed). A second advantage of satisficing is that it seems to come close to describing how humans actually solve many decision problems and, as well, appears to be predictively successful. For better or worse, humans do seem to pick apartments, cars, perhaps even mates that are “good enough” rather than optimal (and note that someone like Simon would say this is “for the better”).

Two criticisms of satisficing concern its stability over time and the setting of satisficing thresholds or aspiration levels. A benefit of the optimization procedure is that an agent can be confident that her decision is the best in a robust sense: the optimal option will be superior to any other alternative. However, if one picks option a under a satisficing procedure, one cannot be confident that option a will be superior to any other future alternative option b. In fact, one cannot be confident that the next alternative option is not better than the current option. This is potentially problematic in the following sense. If one sets one’s satisficing threshold too low, one may quickly find a choice that surpasses this threshold, but is nonetheless unacceptable in a more robust sense. For example, buying the first car one sees on the sales lot is often not recommended, however easy this strategy is to follow. In this example the threshold for “good enough” is clearly too low. This leads to the second broad criticism. When factoring in the calculation needed to determine how low or high to set the satisficing threshold, it is not obvious whether satisficing procedures retain their computational advantage. As previously mentioned, a satisficing threshold that recommends buying the first car one sees on the sales lot is too low. But what threshold should count as representing a “good enough” car? In most cases this is a difficult question. Intuitively, a “good enough” car is one that has some or many desirable features. But is this a probabilistic measure: must these desirable features be known to obtain with the choice selection, or are they merely judged to be probable? Further, how does one compute the relationship between some particular feature of the car and its desirability? The worry is that setting appropriate satisficing thresholds is as difficult as optimizing. Serious concern with these kinds of issues puts pressure on the claim that satisficing procedures have clear computational advantages.

ii. Pollock’s “Locally Global” View of Planning

John Pollock is also critical of optimization strategies for theories of rational decision making, for reasons concerning cognitive limitations. However, rather than focus on the rationality of individual decision problems (such as the apartment finding problem or the car buying problem mentioned above), Pollock’s view concerns rational planning. To see the difference between individual decision problems and planning problems, consider the following example. In deciding what to do with one’s afternoon, one might decide to go to the bank and go to the grocery store. By deciding, one has solved an individual decision problem. However, there are two important issues that are still unresolved for the decision: (1) how to implement the decisions “go to the bank” and “go to the grocery store” (go by car or by bus or walk?) and (2) how to structure the order of individual decisions (go to the bank first, then go to the grocery store second?). Planning generally concerns the implementation and ordering issues illustrated in both (1) and (2). When agents engage in planning they attempt to determine what things to do, how to do them, and how to order them.

Planning is often regarded as broader in scope than “decision theory”, which typically focuses on the rationality of individual actions. Research in artificial intelligence concerning action focuses almost exclusively on planning. One reason for this focus is that many AI researchers want to build agents that operate in the world, and operating in the world requires more than just deciding whether to perform some particular action. As illustrated above, there are often many ways to perform the same action (one may “go to the bank” by traveling by car or by boat or by jet pack). In addition, actions are performed in temporal sequence with other actions, some of which potentially conflict (for example, if the bank closes at 4pm, then it is impossible to go to the bank after one goes to the grocery store).

Now, how ought rational agents to plan? One suggestion is that rational agents choose optimal plans, in a way similar to the optimization procedure mentioned in section 3ai above. An optimal plan is a plan that maximizes some measure (such as expected utility, for example). A simple version of a plan-based optimization procedure might include the following: (i) survey all possible plans and (ii) choose the plan that maximizes expected utility. Many of the claimed virtues for the optimization procedure of individual decisions discussed in section 3ai above also count as virtues of the plan-based optimization procedure.

John Pollock has argued that real, non-ideal agents ought not use plan-based optimization procedures. Part of his argument shares reasons given by Herbert Simon: resource bounded agents such as humans cannot survey and manage the information required to optimize. Further, Pollock responds to this situation in a similar way to Simon. Rather than claim that informational resource limitations show that humans are irrational, Pollock argues that the correct normative standard is actually less demanding and can be satisfied by human agents.

One feature of Pollock’s argument is similar to Christopher Cherniak’s (1986) observation about the inherent informational complexity of a complete consistency check on one’s belief corpus. Pollock argues that because plans are constructed by adding parts or “sub-plans”, the resulting complexity is such that it is almost always impossible to survey the set of possible plans. For example, suppose an agent considers what plan to adopt for the upcoming week. In a week, an agent might easily make over 300 individual decisions, and a plan will specify which decision to implement at each time. Further, suppose that there are only 2 alternative options for each individual decision. This entails that there are 2^300 possible plans for the week to consider, or, approximately 10^90 plans, a number greater than the estimated number of elementary particles in the universe. Obviously, human agents cannot survey or even construct or represent the set of possible plans for a week of decisions. Actually, the situation is much worse. Rational planning includes what things to do, how to do them, and how to order them, and additionally what may be called “contingency plans”. One might adopt a plan to drive to the airport on Sunday, but this plan might also include the contingency plan “if the car won’t start, call a taxi”. Optimization procedures would require selecting the maximally best contingency plans for a given plan (it would typically not be recommended to try to swim to the airport if one’s car won’t start), but additionally surveying and constructing the set of all possible contingency plans only furthers the computational complexity problem with the optimization procedure.
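The arithmetic behind the week-long planning example can be checked directly. The following is a minimal sketch: the figures of 300 decisions and 2 options come from the example above, while the review speed of one trillion plans per second is an assumption chosen only to give a sense of scale.

```python
import itertools
import math

decisions_per_week = 300
options_per_decision = 2

# Exact number of distinct plans: one option chosen at each decision point.
plan_count = options_per_decision ** decisions_per_week
order_of_magnitude = math.floor(math.log10(plan_count))

# For a small case the plans can actually be enumerated.
small_plans = list(itertools.product(range(options_per_decision), repeat=10))

# Assumed review speed, for illustration only.
plans_per_second = 10 ** 12
seconds_per_year = 60 * 60 * 24 * 365
years_needed = plan_count // (plans_per_second * seconds_per_year)

print(f"2^300 is about 10^{order_of_magnitude}")
print(f"plans for 10 decisions: {len(small_plans)}")
print(f"years to survey all weekly plans: more than 10^{len(str(years_needed)) - 1}")
```

Ten binary decisions already yield 1,024 plans, and each further decision doubles the total, which is why surveying the full set for a week is out of reach even at the assumed speed.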

Instead of optimization, Pollock argues that non-ideal human agents should engage in “locally global” planning. Locally global planning involves beginning with a “good enough” master plan (an idea Pollock acknowledges is reminiscent of Simon’s satisficing view), but continually looking for and making small improvements to the master plan. As Pollock claims, “the only way resource bounded agents can efficiently construct and improve upon master plans reflecting the complexity of the real world is by constructing or modifying them incrementally”. The idea is that resource bounded agents ought to defeasibly adopt a master plan which is “good enough”, but continually seek improvements as new information is obtained or new reasoning is conducted.
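The incremental character of this proposal can be loosely illustrated in code. The sketch below is not Pollock’s own formal account: it is a simple local-improvement loop, and the step names, payoff numbers, and the function improve_incrementally are hypothetical. It starts from a “good enough” master plan and keeps any single-step change that raises the plan’s assumed utility.

```python
from typing import Callable, List, Sequence

Plan = List[str]


def improve_incrementally(master_plan: Plan,
                          alternatives: Sequence[Sequence[str]],
                          utility: Callable[[Plan], float],
                          max_rounds: int = 10) -> Plan:
    """Try one-step substitutions and keep any that improve the plan."""
    current = list(master_plan)
    for _ in range(max_rounds):
        improved = False
        for position, options in enumerate(alternatives):
            for option in options:
                candidate = current[:position] + [option] + current[position + 1:]
                if utility(candidate) > utility(current):
                    current = candidate  # adopt the small improvement
                    improved = True
        if not improved:
            break  # no single change helps, so keep the current plan
    return current


# Hypothetical payoffs for ways of carrying out each step of the afternoon.
step_value = {
    "walk to bank": 1.0, "bus to bank": 2.0, "drive to bank": 3.0,
    "walk to store": 1.0, "drive to store": 2.5,
}


def plan_utility(plan: Plan) -> float:
    return sum(step_value[step] for step in plan)


alternatives = [
    ["walk to bank", "bus to bank", "drive to bank"],
    ["walk to store", "drive to store"],
]
good_enough = ["bus to bank", "walk to store"]
better = improve_incrementally(good_enough, alternatives, plan_utility)
```

Each round examines only as many candidates as there are alternative steps, rather than every combination of steps, so the agent always has a usable plan in hand while the search continues.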

iii. Cherniak’s “Minimal Rationality” and “Feasible Inferences”

Christopher Cherniak’s (1986) Minimal Rationality is a seminal work in the study of resource bounded agents, and it discusses the general issue of the relationship between cognitive limitations and normative standards. He begins by arguing against both idealized standards of rationality (“finitary” agents such as humans could never satisfy these conditions) and a “no standards” view of rationality (unlike the agents we recognize, agents subject to no standards would support no predictions about their behavior). The third alternative, that of “minimal rationality”, suggests “moderation in all things, including rationality”. Cherniak claims that many of the minimal rationality conditions can be derived from the following principle:

(MR) If A has a particular belief-desire set, A would undertake some, but not necessarily all, of those actions that are apparently appropriate.

For example, Cherniak is clear in suggesting that rational agents need not eliminate all inconsistent beliefs. This generates the following “minimal consistency condition”:

(MC) If A has a particular belief-desire set, then if any inconsistencies arose in the belief set, A would sometimes eliminate some of them.

In support of (MC), Cherniak argues that non-minimal, ideal views of rationality (ones that suggest agents ought to eliminate all inconsistencies) would actually entail that humans are irrational. As he claims, “there are often epistemically more desirable activities for [human agents] than maintaining perfect consistency”. The idea is that given the various cognitive limitations that humans face (the “finitary predicament”), it would be irrational for any human to attempt to satisfy the Sisyphean task of maintaining a consistent belief corpus.

There are two prominent objections to Cherniak’s minimal consistency condition. First, as Daniel Dennett and Donald Davidson have pointed out in various works, it is difficult to understand or ascribe any beliefs to agents that have inconsistent beliefs. For instance, suppose that Albert believes that p, and that p entails q, but also suppose that Albert believes that q is false. What is Albert’s view of the world? In one sense, it may be argued that Albert has no view of the world (and hence no beliefs) because, ultimately, Albert might be interpreted as believing both q and ¬q, and there is no possible world that could satisfy such conditions. In response, Cherniak invokes an “ought implies can” principle. He suggests that once an agent meets a threshold of minimal rationality, “the fact that a person’s actions fall short of ideal rationality need not make them in any way less intelligible to us”. As such, Cherniak’s response could be understood in a commonsense way: typical human agents have some inconsistent beliefs, but we nonetheless ascribe beliefs to them.

A second objection to Cherniak’s minimal consistency condition concerns the permissiveness of the condition. As Appiah (1990) has worried, “are we left with constraints that are sufficiently rich to characterize agency at all”? As an example, an agent that eliminates a few inconsistent beliefs only on Tuesdays would satisfy (MC). Yet there is something intuitively defective about such a reasoner. Instead, it seems that what is wanted is a set of constraints on reasoners, reasoning, and agency that are more strict and more demanding than Cherniak’s minimal rationality conditions. Perhaps anticipating objections similar to Appiah’s, Cherniak developed what he calls a theory of “feasible inferences”. A theory of feasible inferences recruits descriptive facts about cognitive limitations to provide more restrictive normative requirements. For instance, a theory of “human memory structure” describes what information is cognitively available to human agents, given various background conditions. In general terms, when information is cognitively available to an agent, more normative constraints are placed on the agent. Correspondingly, conditions such as (MC) would thereby be strengthened.

However, it is unclear whether a theory of human memory structure will provide enough detail to propose a “rich structure of constraints” on rationality or agency. For one, Cherniak’s theory of human memory structure describes typical humans. There is even a sense in which “typical human” is an idealized notion since no individual is a typical human. Given that there are individual differences in memory abilities between humans, which constraints should be adopted? If an inference to q is obvious for Alice but it would not be obvious for a typical human, is Alice required to believe q (on pain of irrationality) or is it merely permissible for her to believe q? Note that proponents of idealization strategies (as discussed in section 2) are able to provide a rich structure of constraints and do not have to worry about individual differences in cognitive performance.

iv. Gigerenzer’s “Ecological Rationality”

Gerd Gigerenzer views rationality as fundamentally involving considerations of the agent’s environment and the agent’s cognitive limitations. Similar to many of the theorists discussed above, Gigerenzer also cites Herbert Simon as an influence. Many aspects of Gigerenzer’s view may be understood as responding to the influential project of psychologist Daniel Kahneman, to which this article will turn next.

Gigerenzer (2006) is clear in his rejection of “optimization” views of rationality, which he sometimes calls “unbounded rationality”. As he claims,

. . . it is time to rethink the norms, such as the ideal of omniscience. The normative challenge is that real humans do not need. . . unlimited computational power.

In place of optimization procedures, Gigerenzer argues that resource bounded agents ought to use “heuristics” which are computationally inexpensive and are tailored to the environment and abilities of the agent (and are, hence, “fast and frugal”). Rationality, for Gigerenzer, consists in the deployment of numerous, however disparate, fast and frugal heuristics that “work” in an environment.

To understand Gigerenzer’s view, it is helpful to consider several of his proposed heuristics. For the first example, consider the question of who will win the next Wimbledon tennis championship. One way to answer this question, perhaps in line with the optimality view of rationality, would be to collect vast amounts of player performance data and make statistical predictions. Surely, such a strategy is computationally intensive. Instead, Gigerenzer suggests that in some cases it would be rational to use the following heuristic:

Recognition Heuristic: If you recognize one player but not the other, then infer that the recognized player will win the particular Wimbledon match.

First, the recognition heuristic is obviously computationally cheap—it does not require informational search or deep database calculations, or the storage of large amounts of data. Second, the recognition heuristic is incredibly fast to deploy. Third, this heuristic is not applicable in all environments. Some agents will not be able to use this heuristic because they do not recognize any tennis player, and some agents will not be able to use this heuristic because they recognize every tennis player. Fourth, it is essential to note that proper use of the recognition heuristic, in Gigerenzer’s view, results in a normatively sanctioned belief or judgment. That is, when agents use the recognition heuristic in the appropriate environment, the resulting belief is rational. For instance, if Mary only recognizes Roger Federer in the upcoming match between Federer and Rafael Nadal, then it is rational for her to believe that Federer will win.
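The four points above can be made concrete with a minimal sketch of the recognition heuristic as a decision rule. The function name and the representation of an agent’s knowledge as a set of recognized names are assumptions made for illustration, not Gigerenzer’s own formulation.

```python
from typing import Optional, Set


def recognition_heuristic(player_a: str, player_b: str,
                          recognized: Set[str]) -> Optional[str]:
    """Predict the recognized player, or return None if the rule does not apply."""
    knows_a = player_a in recognized
    knows_b = player_b in recognized
    if knows_a and not knows_b:
        return player_a
    if knows_b and not knows_a:
        return player_b
    return None  # both or neither recognized: the heuristic is silent


# Mary recognizes only one of the two players.
mary = {"Roger Federer"}
mary_prediction = recognition_heuristic("Roger Federer", "Rafael Nadal", mary)

# An expert recognizes both players, and a novice recognizes neither.
expert = {"Roger Federer", "Rafael Nadal"}
expert_prediction = recognition_heuristic("Roger Federer", "Rafael Nadal", expert)
novice_prediction = recognition_heuristic("Roger Federer", "Rafael Nadal", set())
```

The rule needs only two membership checks and no performance data, and it returns no prediction for the expert or the novice, which reflects the point that the heuristic is usable only by partially ignorant agents.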

Some may find this last result surprising or counterintuitive—after all, Mary may know very little about tennis, so how can she have a rational belief that a particular player will win? Gigerenzer would reply that such surprise or counterintuitiveness probably results from holding an optimality view of rationality. Gigerenzer’s project is an attempt to argue that rationality does not consist in gathering large amounts of information and making predictions on this basis. Rather, Gigerenzer thinks that rationality consists in using limited amounts of information in efficient or strategic ways, with the caveat that the proper notion of efficiency and strategy are not idealized notions, but concern the agent’s cognitive limitations and environment.

Now turn to the important question: does the recognition heuristic work? Gigerenzer (2007) found that in approximately 70% of Wimbledon matches, the recognition heuristic predicted the winning player. That is, for agents that are “partially ignorant” about tennis (those that know something about tennis but are not experts) the recognition heuristic gives better-than-chance predictive success.

Consider another heuristic. Humans need to track objects in the environment such as potential threats and sources of food. One way to track an object would be to calculate its trajectory using properties of force, mass, velocity and a series of differential equations. Some AI systems attempt to do just this. It is clear that humans do not explicitly solve differential equations to track objects, but it is also not obvious that humans do this even at a subconscious or automatic level. Gigerenzer (2007) proposes that humans use a “gaze heuristic” in specific situations. For example, consider the problem of tracking an oncoming plane while flying an airplane. One way to infer where an approaching plane will be is to use a series of mathematical formulae involving trajectories and time. A second way would be to use the following gaze heuristic:

Gaze Heuristic: Find a scratch or mark in your airplane windshield. If the other plane does not move relative to this mark, dive away immediately.

As with the recognition heuristic, the gaze heuristic is computationally cheap and fast. Further, this heuristic is not liable to induce calculation errors (as may be the case with the mathematical equations strategy).

Gigerenzer has also argued that a version of the gaze heuristic is used by outfielders when attempting to catch fly balls. This heuristic consists of the following instructions: fix your gaze on the ball, start running, and adjust your running speed so that the image of the ball rises at a constant rate. Interestingly, Shaffer et al. (2004) attached a small camera to dogs when they were fetching thrown frisbees, and it appears that dogs too may use the gaze heuristic. If so, a plausible explanation fits Gigerenzer’s proposal: in the face of resource limitations, many agents use inference strategies that are fast and frugal, and work in their environment.

One initial worry for Gigerenzer’s project of finding fast and frugal heuristics is that it is not clear there are enough heuristics to explain humans’ general rationality. If a non-expert correctly infers that an American will hit the most aces during Wimbledon, was this an inference based on the recognition heuristic (it is not obvious that it must be), or is there an additional heuristic that is used (perhaps a new heuristic that only concerns aces hit in a tennis match)? Gigerenzer is clear in his rejection of “abstract” or “content-blind” norms of reasoning that are general purpose reasoning strategies, but his alternative view may be forced to posit a vast number of heuristics to explain humans’ general rationality. Further, a cognitive system that is able to correctly deploy and track a vast number of heuristics does not obviously have a clear computational advantage.

A second worry concerns the “brittleness” of the proposed heuristics. For instance, referencing the above mentioned recognition heuristic, what ought one to infer in the case of a tennis match where the recognized player becomes injured on court? Of course, the recognition heuristic is not adaptable enough to handle this additional information (with the idea being that injured players, however excellent, are typically unlikely to win). So, there may be instances in which it is rational to override the use of a heuristic. But positing a cognitive system that monitors relevant additional information and judges whether and when to override the use of a specific heuristic might erase much of the alleged computational advantages that heuristics seem to provide.

b. Failing to Meet the Standard

This article will now address the remaining response by theorists to accommodate the facts of cognitive limitations into their normative theorizing. Some philosophers and psychologists have used facts about cognitive limitations to argue that humans fail to meet various normative standards. For instance, one might argue that humans’ inherent memory limitations and corresponding inability to check beliefs for logical consistency entail that humans are systematically irrational. One might argue that humans’ inherent inability to survey all relevant information in a domain entails that all humans are systematically deluded in that domain. Or, concerning morality, one might attempt to argue that cognitive limitations entail that humans must be systematically immoral, because no human could ever make the required utility calculations (of course, under the assumption of a particular consequentialist moral theory).

Though all of the example positions in the above paragraph are somewhat simplistic, they all roughly share the following features: (i) the claim of a somewhat idealized or “difficult to obtain” normative standard and (ii) the claim that facts about cognitive limitations are relevant to the normative enterprise and show that agents cannot meet this normative standard. As a quick review of material covered in previous sections, theorists such as Herbert Simon, John Pollock, Christopher Cherniak, and Gerd Gigerenzer would reject feature (i), because, in very general terms, they have argued that cognitive limitations provide reason for thinking that the relevant normative standards are not idealized and are not “difficult to obtain”. Proponents of the idealization strategy, such as many Bayesians in epistemology, would reject (ii), because they view the cognitive limitations of particular cognitive agents as irrelevant to the normative enterprise.

i. Kahneman and Tversky’s “Heuristics and Biases” Program

Daniel Kahneman and Amos Tversky are responsible for one of the most influential research programs in cognitive psychology. Their basic view is that human agents reason and make judgments by using cognitive heuristics, and that these heuristics produce errors. Hence the label “heuristics and biases”. Though Kahneman and Tversky have taken a nuanced position regarding the overall rationality of humans, others such as Piattelli-Palmarini (1994) have argued that work done in the heuristics and biases program shows that humans are systematically irrational.

Before discussing some of Kahneman and Tversky’s findings, it is important to note two things. First, though both Gigerenzer and Kahneman and Tversky use the name “heuristics”, these theorists plausibly mean to describe different mechanisms. For Gigerenzer, reasoning heuristics are content-specific and are typically tied to a particular environment. For Kahneman and Tversky, heuristics are understood more broadly as a “shortcut” procedure for reasoning or as a reasoning strategy that excludes some kinds of information. Notoriously, Gigerenzer is critical of Kahneman and Tversky’s characterization of heuristics, claiming that their notion is too vague to be useful. Second, Gigerenzer and Kahneman and Tversky evaluate heuristics differently. For Gigerenzer, heuristics are normatively good (in situations where they are “ecologically rational”), and they are an essential component of rationality. Kahneman and Tversky, however, typically view heuristics as normatively suspect since they likely lead to error.

To begin, consider Kahneman and Tversky’s heuristic of “representativeness”. As they say, “representativeness is an assessment of the degree of correspondence between a sample and a population, an instance and a category, an act and an actor or, more generally, between an outcome and a model”. For example, a subject using the representativeness heuristic might infer that a typical summer day is warm and sunny because warm, sunny days are common and frequent in summer, and hence representative of it.

Kahneman and Tversky claim that the representativeness heuristic drives some proportion of human probability judgments. They also claim that the use of this heuristic for probability judgments leads to systematic error. In one experiment Tversky and Kahneman (1983) gave subjects the following description of a person and then asked them a probability question about this description. This is the well-known “Linda the bank teller” description: “Linda is 31 years old, single, outspoken and very bright. She majored in philosophy. As a student, she was deeply concerned with issues of discrimination and social justice, and also participated in anti-nuclear demonstrations”. Next, Kahneman and Tversky asked subjects which of two statements was more probable (given the truth of the above description): (T) Linda is a bank teller, or (T&F) Linda is a bank teller and is active in the feminist movement. Kahneman and Tversky report that approximately 85% of subjects judge (T&F) as more probable than (T). Before discussing the alleged incorrectness of this judgment, consider why subjects might make it. The thought is that, given the description of Linda being an activist in social justice movements and perhaps a philosophy major, (T&F) is more representative of Linda than (T). If Kahneman and Tversky are right in thinking that representativeness drives judgment about probabilities, then their model could explain the result of the Linda case.

But ought agents to judge that (T&F) is more probable than (T), given the description of Linda? This is the important normative question. Kahneman and Tversky rely on the probability calculus as providing the normative standard. According to many versions of the probability calculus, prob(a) ≥ prob(a&b), regardless of the chosen a or b. This may be called “the conjunction rule” for probabilities. The basic idea is that a narrower or smaller class of objects is never more probable than a larger class, and that the overlap of two classes cannot be larger than one of the individual classes. For example, which class is larger, the class of all trucks (Tr) or the class of all white trucks (W&Tr)? Clearly, the answer is the class of all trucks, because every white truck is also a truck. So, which is more probable, that there is a truck parked in front of the White House right now (Tr) or that there is a white truck parked in front of the White House right now (W&Tr)? Plausibly, it is more likely that there is a truck parked in front of the White House (Tr), because any white truck is also a truck, and hence would also count toward the likelihood of there being a truck parked there.
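The conjunction rule follows from the fact that prob(a) = prob(a&b) + prob(a&¬b), where the second term is never negative. A minimal sketch can confirm this for the Linda case. The counts below are hypothetical numbers of people fitting Linda’s description; they are invented for illustration and do not come from Tversky and Kahneman’s data.

```python
from fractions import Fraction
from itertools import product

# Hypothetical counts, keyed by (is_bank_teller, is_feminist).
counts = {
    (True, True): 5,
    (True, False): 1,
    (False, True): 80,
    (False, False): 14,
}
total = sum(counts.values())

p_teller = Fraction(counts[(True, True)] + counts[(True, False)], total)
p_teller_and_feminist = Fraction(counts[(True, True)], total)


def conjunction_rule_holds(tt: int, tf: int, ft: int, ff: int) -> bool:
    """Check prob(T&F) <= prob(T) for one assignment of non-negative counts."""
    n = tt + tf + ft + ff
    return Fraction(tt, n) <= Fraction(tt + tf, n)


# Exhaustively check every small assignment of counts with a non-empty population.
all_hold = all(
    conjunction_rule_holds(tt, tf, ft, ff)
    for tt, tf, ft, ff in product(range(4), repeat=4)
    if tt + tf + ft + ff > 0
)

print(p_teller, p_teller_and_feminist, all_hold)
```

Even though the assumed counts make feminist bank tellers far more common than non-feminist ones, the conjunction still cannot be more probable than the single statement, because every feminist bank teller is also counted as a bank teller.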

Kahneman and Tversky appeal to the probability calculus as providing the normatively correct rule of reasoning for the Linda case. Because 85% of subjects responded that (T&F) was more probable than (T), against the conjunction rule, Kahneman and Tversky claim that most subjects made an incorrect judgment. So, on their view, this is a case where resource limitations cause human agents to use shortcut procedures such as the representativeness heuristic, and the representativeness heuristic gets the wrong answer. Hence, the representativeness heuristic is responsible for a cognitive bias.

The alleged cognitive bias in the Linda case provides just one part of Kahneman and Tversky’s overall program of heuristics and biases. They have argued that human subjects make errors with insensitivity to prior probabilities, insensitivity to sample size, misconceptions of chance, and misconceptions of regression. Importantly, these claims rely on the probability calculus as providing the correct normative standard. But should we think that the probability calculus provides the correct normative standard for rationality?

One straightforward reason to think that the probability calculus provides the correct normative standard for rational belief concerns logical consistency. Violation of the standard axioms of the probability calculus entails a set of inconsistent probabilistic statements. As such, degrees of belief that satisfy the probability calculus are often called “coherent” degrees of belief. For reasons similar to those given in the introduction to this article, it is often thought that it is not rational to believe a set of inconsistent propositions. Hence, it seems rational to obey the probability calculus.

However, there are significant worries with thinking that the probability calculus provides the correct normative standard for rationality. First, following the rules of the probability calculus is computationally demanding. Independent of Kahneman and Tversky’s experimental results, we should anticipate that few humans would be able to maintain coherent degrees of probabilistic belief, for reasons of computational complexity alone. This observation would entail that humans are not rational, yet this goes against our commonsense view that humans are often quite rational. Indeed, it might be difficult to explain how we’re able to predict human behavior without the corresponding view that humans are usually rational. Insofar as our commonsense view of human rationality is worth preserving, we have reason to think that the probability calculus does not provide a correct normative standard.

A second worry concerns tautologies. According to standard interpretations of probability, every tautology gets assigned probability 1. But if the probability calculus provides a normative standard for belief, then it is rational for us to believe every tautology (for any set of evidence e). But this seems wrong. There are many complex propositions that are difficult to parse or interpret or even understand, but are nonetheless tautologies. Until one recognizes these propositions as instances of a tautology, it does not seem rational to believe just any tautology.
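The gap between being a tautology and being recognized as one can be illustrated with a brute-force truth-table check. This is a minimal sketch: the helper names are hypothetical, and Peirce’s law is used only as a familiar example of a tautology that is not obvious at first glance.

```python
from itertools import product
from typing import Callable


def implies(a: bool, b: bool) -> bool:
    """Material conditional: false only when a is true and b is false."""
    return (not a) or b


def is_tautology(formula: Callable[..., bool], num_vars: int) -> bool:
    """Check the formula on every row of its truth table (2**num_vars rows)."""
    return all(formula(*row) for row in product([False, True], repeat=num_vars))


def peirce(p: bool, q: bool) -> bool:
    # Peirce's law: ((p -> q) -> p) -> p
    return implies(implies(implies(p, q), p), p)


def plain_conditional(p: bool, q: bool) -> bool:
    # p -> q, which fails when p is true and q is false
    return implies(p, q)


peirce_is_tautology = is_tautology(peirce, 2)
conditional_is_tautology = is_tautology(plain_conditional, 2)
rows_for_300_variables = 2 ** 300  # size of the truth table for 300 atomic sentences
```

The check is trivial for two variables, but the number of rows doubles with each additional variable, so a resource bounded agent cannot in general be expected to verify that an arbitrary complex proposition deserves probability 1.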

A third and final worry concerns the psychological nature and phenomenology of belief. If the probability calculus provides the correct normative standard for belief then most of our contingent beliefs (for example, “the coffee cup is on the desk”) will have a precise numerical probability assignment, and this number will be less than 1. Call beliefs that are less than 1 but greater than 0.5 “likely beliefs”. Many of our familiar contingent beliefs will be likely beliefs (hence, getting some number assignment such as 0.99785), but it is unclear that our cognitive systems would be able to store or even compute vast amounts of probabilistic information. Belief seems to not work this way. There are, of course, projects in artificial intelligence that attempt to model similar probabilistic systems, but their results have not been universally convincing. Secondly, the phenomenology of belief suggests that many of our contingent beliefs are not “graded” entities that admit of some number, but are binary or “full” beliefs. When one believes that “the coffee cup is on the desk” it often feels like one “fully” believes it, rather than merely “partially” believing it (as would be required if the belief were assigned probability 0.99785). As an example, when reasoning about contingent matters of fact, we often treat our beliefs as full beliefs. Hence, the following reasoning seems both commonplace and acceptable, and does not require probabilities: “I think the coffee cup is in the office, so I should walk there to get the cup”. Hence, the phenomenology of belief gives a possible reason to doubt that the probability calculus provides the correct normative standard for belief.

4. References and Further Reading

a. References

Appiah, Anthony. (1990). “Minimal Rationality by Christopher Cherniak.” The Philosophical Review, 99 (1): 121–123.
Bartlett, Frederic C. (1932). Remembering: A Study in Experimental and Social Psychology, Cambridge, Cambridge University Press.
Cherniak, Christopher. (1986). Minimal Rationality, Cambridge, MIT Press.
- An important work in the study of resource bounded agents. Discusses idealization in theories of rationality and conditions for agenthood.
Cowan, N. (2001). “The Magical Number 4 in Short-Term Memory: A Reconsideration of Mental Storage Capacity.” Behavioral and Brain Sciences, 24: 87–185.
Feldman, Richard and Conee, Earl. (1985). “Evidentialism.” Philosophical Studies, 48: 15–34.
- Contains a discussion of “ought implies can” principles in epistemology.
Gigerenzer, Gerd. (2006). “Bounded and Rational.” In Stainton, Robert J. (ed.) Contemporary Debates in Cognitive Science, Oxford, Blackwell.
Gigerenzer, Gerd. (2007). Gut Feelings: The Intelligence of the Unconscious, New York, Viking.
- Summarizes and illustrates Gigerenzer’s program of “fast and frugal” heuristics, and is intended for a wide audience.
Lavie, N. (1995). “Perceptual Load as a Necessary Condition for Selective Attention.” Journal of Experimental Psychology: Human Perception and Performance, 21: 451–468.
Lavie, N. (2005). “Distracted and Confused? Selective Attention Under Load.” Trends in Cognitive Sciences, 9 (2): 75–82.
Macrae, C.N. and MacLeod, M.D. (1999). “On Recollections Lost: When Practice Makes Imperfect.” Journal of Personality and Social Psychology, 77: 463–473.
Miller, George A. (1956). “The Magical Number Seven, Plus or Minus Two: Some Limits On Our Capacity For Processing Information.” The Psychological Review, 63 (2): 81–97.
- Classic paper on memory limitations and an early example of the fields of cognitive science and cognitive psychology.
Piattelli-Palmarini, Massimo. (1994). Inevitable Illusions: How Mistakes of Reason Rule Our Minds, New York, John Wiley and Sons.
- Applies elements of the “heuristics and biases” program and argues that these results help reveal common errors in judgment.
Pollock, John. (2006). Thinking About Acting: Logical Foundations for Rational Decision Making, Cambridge, Oxford University Press.
- Applying work from epistemology and cognitive science, Pollock proposes a theory of rational decision making for resource bounded agents.
Sears, Christopher R. and Pylyshyn, Zenon. (2000). “Multiple Object Tracking and Attentional Processing.” Canadian Journal of Experimental Psychology, 54 (1): 1–14.
Shaffer, Dennis M., Krauchunas, Scott M., Eddy, Marianna, and McBeath, Michael K. (2004). “How Dogs Navigate to Catch Frisbees.” Psychological Science, 15 (7): 437–441.
Simon, Herbert A. (1955). “A Behavioral Model of Rational Choice.” The Quarterly Journal of Economics, 69 (1): 99–118.
Simon, Herbert A. (1956). “Rational Choice and the Structure of the Environment.” Psychological Review, 63 (2): 129–138.
- An early description of the satisficing procedure.
Stroop, J.R. (1935). “Studies of Interference In Serial Verbal Reactions.” Journal of Experimental Psychology, 18: 643–662.
Thagard, Paul. (1982). “From the Descriptive to the Normative in Psychology and Logic.” Philosophy of Science, 49 (1): 24–42.
Tversky, Amos and Kahneman, Daniel. (1983). “Extensional Versus Intuitive Reasoning: The Conjunction Fallacy in Probability Judgment.” Psychological Review, 90 (4): 293–315.
- Contains the well-known “Linda” example of the conjunction fallacy in probabilistic judgment.
Xu, Yaoda and Chun, Marvin. (2009). “Selecting and Perceiving Multiple Visual Objects.” Trends in Cognitive Sciences, 13 (4): 167–174.

b. Further Reading

Bishop, Michael A. and Trout, J.D. (2005). Epistemology and the Psychology of Human Judgment, Oxford, Oxford University Press.
- Discusses and offers critiques of various epistemic norms, often citing important work in cognitive science and cognitive psychology.
Christensen, David. (2005). Putting Logic in its Place, Cambridge, Oxford University Press.
- Provides discussion about the use of idealized models. Argues that the unattainability of idealized normative standards in epistemology does not undermine their normative force.
Gigerenzer, Gerd and Selten, Reinhard (eds.). (2001). Bounded Rationality: The Adaptive Toolbox, Cambridge, MIT Press.
- An influential collection of papers on bounded rationality.
Goldstein, E. Bruce. (2011). Cognitive Psychology: Connecting Mind, Research, and Everyday Experience. Belmont, Wadsworth.
- Introductory text in cognitive psychology. Some of the examples of cognitive limitations from section 1 were drawn from this text.
Kahneman, Daniel. (2011). Thinking, Fast and Slow. New York, Farrar, Straus, and Giroux.
- Provides an overview of the “heuristics and biases” program and the two-system model of judgment.
Morton, Adam. (2012). Bounded Thinking: Intellectual Virtues for Limited Agents, Oxford, Oxford University Press.
- A virtue-theoretic account of bounded rationality and bounded thinking. Addresses how agents should manage limitations.
Rubinstein, Ariel. (1998). Modeling Bounded Rationality, Cambridge, MIT Press.
- Provides examples of formal models for resource bounded agents.
Rysiew, Patrick. (2008). “Rationality Disputes — Psychology and Epistemology.” Philosophy Compass, 3 (6): 1153–1176.
- Good discussion and overview of the “rationality wars” debate in cognitive science and epistemology.
Simon, Herbert A. (1982). Models of Bounded Rationality, Vol. 2, Behavioral Economics and Business Organization. Cambridge, MIT Press.
- Collection of some of Simon’s influential papers on bounded rationality and procedural rationality.
Weirich, Paul. (2004). Realistic Decision Theory: Rules for Nonideal Agents in Nonideal Circumstances, Oxford, Oxford University Press.
- Argues for principles of decision making that apply to realistic, non-ideal agents.

Author Information

Jacob Caton
Email: jcaton@astate.edu
Arkansas State University
U. S. A.

Gender in Chinese Philosophy

The concept of gender is foundational to the general approach of Chinese thinkers. Yin and yang, core elements of Chinese cosmogony, involve correlative aspects of “dark and light,” “female and male,” and “soft and hard.” These notions, with their deeply-rooted gender connotations, recognize the necessity of interplay between these different forces in generating and carrying forward the world. The major thinkers of China’s first philosophic flourishing—traditionally referred to as the Hundred Schools, c. 500s-200s B.C.E.—inherited and further developed this comprehensively gendered view of the world. These concepts continue to shape contemporary Chinese thought, as well. Historically, the most influential Chinese perspectives on the issue of gender come from what are commonly referred to as Confucian and Daoist traditions of thought, which take somewhat opposing positions. Many texts associated with Confucianism emphasize yang’s dominant, male-related characteristics, whereas those linked to Daoism, especially the Laozi, reverse this view, finding value in yin’s subordinate, female characteristics. However, it should be noted that Chinese thinkers, regardless of their classification as Confucian or Daoist, generally see the opposing qualities of yin and yang as integral parts of a whole that complement one another. Accordingly, the closest word to “gender” in modern Chinese is xingbie, which can be quite literally understood as a difference (bie) of individual nature or tendencies (xing). The word generally, however, refers to the physiological characteristics that then provide the basis for corresponding social identities. The genders, in terms of social roles, are not defined absolutely or theoretically, but rather through the mutually reciprocal, physical, generative relationship between male and female. They are understood correlatively, and determined by their context and dynamic tendencies as they interact with one another. 
Such traditions within Chinese thought may be applied as resources for contemporary feminist philosophy, albeit not without considerable caution.

Introduction
Human Tendencies (Nature) and Gender
Gender Cosmology
Gender and Social Order
Family Patterns
Chinese Cultural Resources for Feminism
References and Further Reading

1. Introduction

There is a debate in contemporary Chinese academic circles about whether or not the idea of “gender” or “gender concepts” actually applies to traditional Chinese thought. Chinese scholars argue about the presence of “male” (xiong) and “female” (ci) characteristics, differences, and relations in the context of ancient Chinese philosophy. Although affirming this interpretation would provide a space for comparative studies with Western traditions, some thinkers believe that doing so distorts traditional Chinese thought.

Zhang Xianglong is a prominent representative of those who think that Chinese philosophy and culture have long been influenced by concepts of gender. For him, Chinese thinking is fundamentally gendered as it takes the interaction between male and female as the basic model for philosophical investigations. He further argues that it is one of the core aspects of mainstream thought in China. Zhang demonstrates that yin and yang strongly connote ideas of female and male, and identifies such gendered thought in works as early as the Zhouyi, or Yijing (Book of Changes), a traditional Chinese divinatory text of uncertain antiquity consisting of hexagrams and their interpretations, as well as throughout the later traditions of Confucianism, Daoism, and Chinese Buddhism. Accordingly, he argues that yin and women have “in principle never been doomed to be inferior” and “discrimination against women in ancient Chinese culture is neither deterministic nor universal” (Zhang 2002: 5). Such a claim is dubious, as the dualistic dynamic of yin and yang, while positing both aspects as essential to existence and in this way ontologically equal, has been generally presented as inherently hierarchical. Chen Jiaqi opposes Zhang’s broader position, arguing that yin and yang are not necessarily related to gender. For Chen, yin and yang primarily involve social relationships, political forms, and weighing advantages and disadvantages. He holds that gender characteristics are too abstract to be practically relevant in this context, and do not apply directly to social forms (Chen 2003).

From a historical perspective, Chen’s interpretation is less convincing than Zhang’s. There are numerous Chinese texts where yin and yang are broadly associated with gender. While yang and yin are not exclusively defined as “male” and “female,” and either sex can be considered yin or yang within a given context, in terms of their most general relation to one another, yin references the female and yang the male. For example, the Daoist text known as the Taipingjing (Scripture of Great Peace) records that “the male and female are the root of yin and yang.” The Han dynasty Confucian thinker Dong Zhongshu (195-115 B.C.E.) also writes, “Yin and yang of the heavens and the earth [which together refer to the cosmos] should be male and female, and the male and female should be yin and yang. Thereby, yin and yang can be called male and female, and male and female can be called yin and yang.” These and other texts draw a strong link between yin as female and yang as male. However, it is important to also recognize that gender itself is not as malleable as yin and yang, despite this connection. While gender remains fixed, its coupling with yin and yang is not. This close and complex relationship means yin and yang themselves require examination if their role in Chinese gender theory is to be properly understood.

The original meaning of yin and yang had little to do with gender differences. Some of the earliest uses of yin and yang are found in the Shangshu (Book of Documents). Here, the word yang is employed six times, and five times it denotes the southern side of mountains, which receives the most sunlight. The term yin appears three times in the text, and refers to the shadier northern side of mountains. These examples are characteristic of how yin and yang function throughout Chinese intellectual history; they do not refer to particular objects, but act as correlative categorizations. In most instances yin and yang are used to indicate a specific relationship within a determined context. The way sunshine falls on a mountain is the context, and the difference between the northern and southern sides, where the latter receives more light and warmth, determines their association, which is understood as yin and yang. The terms are thereby an expression of the function of the sun on a particular place, but they do not speak to the actual substance of the objects (the sun or mountain) themselves. The specific traits of the objects can only be designated yin and yang in their functional correlation to one another. Within this matrix, yin things share commonalities when viewed in relation to yang things.

In this way, the early association of yin and yang with gender can be seen as speaking to the relationship between genders, and not to their essential or substantial natures. Yin and yang traits were thus seen as able to accurately describe broad differences between males and females as they interact with one another. Fixing the link between these categorizations, having men be yang in relation to women, who are yin, only works in a highly abstract or broad sense. For example, the Book of Changes states that the emperor is supposed to have six male ministers at the south palace (a yang position) and six wives or concubines at the north palace (a yin position). Like the southern and northern sides of a mountain, men and women are yang and yin in the way they serve the emperor. Social positions are linked to gender and understood through yin and yang. The Liji (Record of Rituals) states that “the male is outside, and the wife inside the home. The sun starts in the east and the moon starts in the west. This is the distinction of yin and yang, the positions of husband and wife.” However, in specific contexts, it is possible for the association to be reversed. For instance, in Dong Zhongshu’s Chunqiu Fanlu (Luxuriant Dew of the Spring and Autumn Annals), we also find that “the sovereign is yang, the minister is yin; the father is yang, the son is yin.” Here males, such as ministers or sons, can also be considered yin. The entire pattern can be overturned, as well, such as in the relationship between an empress and her male ministers, where the woman is yang and the men are considered yin. However, such a situation was often considered something that should be approached with caution, as it violated natural patterns. For example, Wang Bi (226-249 C.E.), who did not care much for Dong Zhongshu’s cosmological interpretation, still argued that a woman who was too strong was not to be married.

In terms of actual practice, the more generalized and stable affiliation between yin as female and yang as male often won out, as exemplified by Wang’s idea. It was commonly appropriated as an ideological tool for backing the oppression of women, especially after Dong Zhongshu’s theories took hold. Dong, whose version of Confucianism won imperial backing during the Han dynasty, was also responsible for promoting the official establishment of a formal cosmology based on yin and yang, which became quite influential in the Chinese tradition. While he allows for men to be understood as yin and women as yang in certain contexts, overall he sought to limit the scope of such reversals. For Dong, males are dominant, powerful, and moral, and therefore yang. Women, on the other hand, are precisely the opposite—subservient, weak, selfish, and jealous—and best described as yin. As a result, female virtues became largely oriented toward social roles, especially women’s duties as wives (for example, the female virtues of chastity and compliancy). Against this biased intellectual background, oppressive practices were supported and initiated. For instance, the widespread acceptance of concubinage and female foot binding in Chinese social history expressed the inequality between genders.

However, this social inequality did not accurately reflect its culture’s philosophical thought. Most Chinese thinkers were very attentive to the advantages of the complementary nature of male and female characteristics. In fact, in many texts considered Confucian that were predominant for two millennia of Chinese thought, the political system and gender roles are integrated (Yang 2013). This integration is based on understanding yin and yang as fundamentally affixed to gender and thereby permeating all aspects of social life. Sinologists such as Joseph Needham have identified a “feminine symbol” in Chinese culture, rooted in the Daoist concentration on yin. Roger Ames and David Hall similarly argue that yin and yang indicate a “difference in emphasis rather than difference in kind” and should be viewed as a whole, and that therefore their relationship can be likened to that of male and female traits (Ames and Hall 1998: 90-96). Overall, while the complementary understanding of yin and yang did not bring about gender equality in traditional Chinese society, it remains a key factor for comprehending Chinese conceptions of gender. As Robin Wang has noted, “on the one hand, yinyang seems to be an intriguing and valuable conceptual resource in ancient Chinese thought for a balanced account of gender equality; on the other hand, no one can deny the fact that the inhumane treatment of women throughout Chinese history has often been rationalized in the name of yinyang” (Wang 2012: xi).

2. Human Tendencies (Nature) and Gender

Gender issues play an important role in the history of Chinese thought. Many thinkers theorized about the significance of gender in a variety of areas. The precondition for this discussion is an interpretation of xing, “nature” or “tendencies.” The idea of “differences of xing” constitutes the modern term for “gender,” xingbie (literally “tendency differences”), making xing central to this discussion. It should be noted that the Chinese understanding of xing, including “human xing,” is closer to “tendency” or “propensity” than traditional Western conceptions of human “nature.” This is mainly because xing is not seen as something static or unchangeable. (It is for this reason that Ames and Hall, in the quote above, highlight the difference between “emphasis” and “kind.”) The way xing is understood greatly contributes to the way arguments about gender unfold.

The term xing first became an important philosophical concept in discussions about humanity and eventually human tendency, or renxing. In terms of its composition, the character xing is made up of a vertical representation of xin, “heart-mind” (the heart was thought to be the organ responsible for both thoughts and feelings/emotions) on the left side. This complements the character sheng, to the right, which can mean “generation,” “grow,” or “give birth to.” In many cases, the way sheng is understood has a significant impact on interpreting xing and gender. As a noun, sheng can mean “natural life,” which gives rise to theories about “original nature” or “foundational tendencies” (benxing). It thereby connotes vital activities and physiological desires or needs. It is in this sense that Mengzi (372-289 B.C.E.) describes human tendencies (renxing) as desiring to eat and have sex. He also says that form and color are natural characteristics, or natural xing. The Record of Rituals similarly comments that food, drink, and relations between men and women are defining human interests. Xunzi (312-238 B.C.E.), generally regarded as the last great classical Confucian thinker, fundamentally disagreed with Mengzi’s claim that humans naturally tend toward what is good or moral. He did, however, similarly classify xing as the desire for food, warmth, and rest.

Sheng can also be a verb, which gives xing a slightly different connotation. As a verb, sheng indicates creation and growth, and thus supports the suggestion that xing should be understood as human growth through the development of one’s heart-mind, the root, or seat, of human nature or tendencies. The Mengzi expressly refers to this, stating that xing is understood through the heart-mind. This also marks the distinction between humans and animals. A human xing provides specific characteristics and enables a certain orientation for growth that is unique in that it includes a moral dimension. It is in this sense that Mengzi proposes his theory for natural human goodness, a suggestion that Xunzi later rebuts, albeit upon a similar understanding of xing. Texts classified as Daoist, such as the Laozi and Zhuangzi, similarly affirm that xing is what endows beings with their particular virtuousness (though it is not necessarily moral).

Human nature, or tendencies, is the basis of the unique human capacity for moral cultivation. The Xing Zi Ming Chu (Nature Derives from Mandate), a 4th century B.C.E. text recovered from the Guodian archaeological site, comments that human beings are defined by the capacity and desire to learn. Natural human tendencies are thereby not simply inherent; they also need to be grown and refined. The Mengzi argues that learning is nothing more than developing and cultivating aspects of one’s own heart-mind. The Xunzi agrees, adding that too much change or purposeful change can bring about falsity, which often results in immoral thoughts, feelings, or actions. These texts agree in their argument that there are certain natural patterns or processes for each thing, and deviating from these is potentially dangerous. Anything “false” or out of accordance with these patterns is likely to be immoral and harmful to oneself and society, so certain restrictions are placed on human practice to promote moral growth. These discussions look at human tendencies as largely shaped in the context of society, and can be taken as a conceptual basis for understanding gender as a natural tendency that is steered through social institutions. For example, when Mengzi is asked why the ancient sage-ruler Shun lied to his parents in order to marry, Mengzi defends Shun as doing the right thing. Explaining that otherwise Shun would have remained a bachelor, Mengzi writes, “The greatest of human relations is that a man and a woman live together.” Thus Mengzi argues that Shun’s moral character was based on proper cultivation of his natural tendencies according to social mores.

One’s individual nature is largely influenced, and to some extent even generated, by one’s cultural surroundings. This also produces physiological properties that account for a wide variety of characteristics that are then reflected in aspects of gender, culture, and social status. Linked to the understanding of yin and yang as functionally codependent categorizations, differences between genders are characterized on the basis of their distinguishing features, and defined correlatively. This means that behavior and identity largely arise within the context of male-female relations. One’s natural tendencies include gender identity as either xiong xing (male tendencies) or ci xing (female tendencies), which one is supposed to cultivate accordingly. Thus there are more physiological and cultural aspects to human tendencies, as well. In these diverse ways, Chinese philosophy emphasizes the difference between males and females, believing that each has their own particular aspects to offer, which are complementary and can be unified to form a harmonious whole (though this does not necessarily imply their equality).

3. Gender Cosmology

The idea of gender as being fundamentally understood through respective dissimilarities (nan nü you bie) is based in the physiological differences between men and women, but also manifests in philosophic thought. In fact, in one of the earliest references to the distinction between men and women, the Record of Rituals asserts:

Once there is a difference between males and females, then there can be love between fathers and sons. Once there is love between fathers and sons, obligations are generated. Once obligations are generated, rituals are made. Once rituals are made, all things can be at ease.

The original difference between genders is—presumably through the generative power of their combination—the foundation for obligations (or morality) and thus ritual (or social moral patterns), which allows finally for harmony in the cosmos as a whole. Through the establishment of the concept that human tendencies are formed and act in line with nature, Chinese gender cosmology applies an analogous generative model of yin and yang to a general understanding of the world.

Another early text, the 3rd century B.C.E. medical compendium Huangdi Neijing (Inner Scripture of the Yellow Emperor), offers one of the most comprehensive definitions of yin and yang:

Yin and yang are the dao (“way”) of the heavens and earth, they provide the model for the net (gangji) of all beings, they are the parents of all change and transformation, and the origin of life and death, and the residence for spirit and insight. To heal illness [one] must seek its root. (Zhang 2002: 41)

Here, yin and yang are taken as a pattern embedded in the existence of all beings, thus providing the foundation for a coherent worldview. This weaves together human beings, nature, and dao (way) in a manner that creates a dynamic wholeness pervaded by and mediated through the interaction of yin and yang. This Chinese cosmological view sees all things, including humans, as born of both yin and yang and thus naturally integrated with one another. In essence, dao represents the interaction between yin and yang, and it is in this respect that the Laozi tells us that dao is both the source and the model, or pattern, for all things (Laozi 25). More directly, the Laozi comments that all things in turn carry yin and embrace yang (Laozi 42). This shows that through yin and yang and their patterns of interaction dao provides the rhythm of the cosmos. From this perspective the genders also complement and nourish one another, and are even vital to one another.

The idea that the interaction of yin and yang generates the myriad things in existence corresponds to intercourse between male and female as the only means for reproducing life. Therefore, the nature of men and women in Chinese philosophy is not only based on purely physiological characteristics and differences, but is also the embodiment of yin and yang forces in gender. The dao of men and women is linked to the dao of the universe in terms of reproducing life. This is systematically discussed in the Book of Changes, one of China’s most ancient and influential texts. There, eight trigrams are given, which represent eight natural phenomena and can further be combined to form sixty-four hexagrams. These are expressions of the function and movement of yin and yang. They are composed of two contrasting symbols: the yang-yao unbroken horizontal line, and the yin-yao broken horizontal line. Some scholars see these as referring to the male and female genitals respectively. In this sense, the first two hexagrams qian or “heaven” (which is six yang-yaos) and kun or “earth” (six yin-yaos) can be interpreted as representing pure yang and pure yin. They are also responsible for the formation of general gender stereotypes in Chinese thought. They provide the gateways for change, and are considered, quite literally, the father and mother of all other hexagrams (which equates to all things in the world). The broad system of the Book of Changes attempts to explain every type of change and existence, and is built upon an identification of yin and yang with the sexes as well as their interaction with one another.

According to the “Xici Zhuan” (Commentary on the Appended Phrases) section of the Book of Changes, qian is equated with the heavens, yang, power, and creativity, while kun is identified with the earth, yin, receptivity, and preservation. Their interaction generates all things and events in a way that is similar to the intercourse between males and females, bringing about new life. The Commentary on the Appended Phrases makes the link to gender issues clear by stating that both qian and kun have their own daos (ways) that are responsible for the male and female respectively. The text goes on to discuss the interaction between the two, both cosmologically in terms of the heavens and earth and biologically in terms of the sexes. The conclusion is that their combination and interrelation is responsible for all living things and their changes. The intercourse between genders is a harmonization of yin and yang that is necessary not only for an individual’s well-being, but also for the proper functioning of the cosmos. Interaction between genders is thus the primary mechanism of life, which explains all forms of generation, transformation, and existence.

4. Gender and Social Order

Theoretically, the social order of gender in Chinese thought is broadly formed on the concepts of the heavens and earth and yin and yang. When these notions are applied to the social field, they are likened to the male and female genders. In the aforementioned Commentary on the Appended Phrases, heaven and yang are considered honorable, while the earth and yin are seen as lowly in comparison. Since the former are coupled with qian, which comprises maleness, and the latter with kun, which marks femaleness, these gender roles are valued similarly. The Inner Scripture of the Yellow Emperor says that yang’s maleness is meant for the outside, and yin’s femaleness for the inside. Men, being equated here with yang, are also associated with superiority, motion, and firmness, while women are coupled with yin and so seen as inferior, still, and gentle. Gender cosmology then largely replaced more dynamic views of gender roles with sharply defined unequal relationships, and these were generally echoed throughout the culture. The social order that emerged from this thought saw men as largely in charge of external affairs and superior to women.

The specific operational mode for maintaining this social order and its gender distinctions is li, propriety or ritual. The Record of Rituals focuses much of its discourse on specific rules regarding distinct practices reserved for certain individuals through gender categorization. In this way, wedding ceremonies are the root of propriety. Marriage is especially important because it is politically valuable for establishing and sustaining social order through designated male-female relations. In the Record of Rituals, men and women are asked to observe strict separation in society and uphold the distinction between the outer and inner. (Men being responsible for the family’s “outer” dealings, including legal, economic, and political affairs, and women the “inner” ones, such as familial relations and housework.) Social roles were thereby moralized according to gender. The Record of Rituals also tells us that the rites as a couple begin with gender responsibilities. It states, for example, that when outside the home the husband is supposed to lead the way and that the wife should follow. However, within the home women were supposed to obey men as well, even boys. Before marriage, a girl was expected to listen to her father, and then after marriage to be obedient to her husband, or to their sons if he died. These general guidelines are commonly referred to in other texts as the sancong side or “three obediences and four virtues,” which dominated theories of proper social ordering for most of China’s history.

The four virtues—women’s virtue (fude), women’s speech (fuyan), women’s appearance (furong), and women’s work (fugong)—were expounded on by Ban Zhao (45-120 C.E.) in her book Nüjie (Admonitions for Women). She believed that women should be conservative, humble, and quiet in expressing ritual or filial propriety as their virtue. In the same way, a woman’s speech should not be “flowery” or persuasive, but yielding and circumspect. She should also pay close attention to her appearance, be clean and proper, and act especially carefully around guests and in public. Her work consists mainly in household practicalities, such as weaving and food preparation.

The sancong (three obediences) can also be regarded as a forerunner to the san gang, or “three cardinal guides,” of the later Han dynasty (25-220 C.E.). The three cardinal guides were put forward by the aforementioned Dong Zhongshu and contributed greatly to integrating yin and yang gender cosmology into the framework of Confucian ethics. These guides are regulations about relationships: they are defined as the ruler guiding ministers, fathers guiding sons, and husbands guiding wives. Although these rules lack specific content, they do provide a general understanding for ordering society that is concentrated on proper relationships, which is the basic element for morality in many Confucian texts. Here a strong gender bias emerges. The partiality shown toward the elevated position of husbands is only further bolstered by the other two relationships being completely male-based. The only time females are mentioned they are last. Moreover, the ranking of the relationships themselves is hierarchical, relegating women to the lowest level of this order.

Dong also elaborated on distinguishing goodness from evil based on elevating things associated with yang and its general characteristics as ultimately superior to yin, and at the same time emphasized their connections to gender characteristics. This further reinforces deep gender bias. The language of Dong’s Luxuriant Dew of the Spring and Autumn Annals praises males and presents a negative view of females and all things feminine. The text explicitly argues that even if there are ways in which the husband is inferior to the wife, the former is still yang and therefore better overall. Even more drastically, it states that evilness and all things bad belong to yin, while goodness and all things good are associated with yang, which implicitly links good and evil to male and female, respectively. There are places where, due to the interrelated correlative relationship between yin and yang, the female might be yang and therefore superior in certain aspects, but since she is mostly yin, she is always worse overall. The text even goes so far as to require that relationships between men and women be adjusted to strictly conform to the three cardinal guides. Rules require that subjects obey their rulers, children their fathers, and wives their husbands. In Dong’s other writings, he goes a step further, declaring that the three cardinal guides are a mandate of the heavens. This gives cosmological support to his social arrangement, equating male superiority with the natural ordering of all things.

In the Baihutong (Philosophical Discussions in the White Tiger Hall), which is a collection of court debates from the later Han dynasty, discourse on Dong’s guidelines is taken further. During this time, Confucianism was established as the official state ideology and heavily influenced many areas of politics, including court functioning, policies, and education. This, in turn, provided the foundation for a Confucian society in which this ideology successfully penetrated the daily lives of the state’s entire populace. Dong’s interpretation of ancient texts, including his reading of gender cosmology, became especially powerful because Confucianism holds that the basis for social order and morality begins in human interaction, not individuals. In this context, people are mainly understood according to their roles in society or relationships with others, which were already established as naturally hierarchical in the Analects (the record of Confucius’s actions and words). Dong’s work added a distinct favoring of male over female that became more established and widespread as Confucianism grew in influence. Conceived of as analogous to the relationship between rulers and ministers, teachers and students, or parents and children, the relationship between the two sexes was generally assumed to follow a natural ordering of superior and inferior.

Although these sexist trends are not found in earlier texts—at least not explicitly—they became quite common after the Han dynasty. (The most controversial exception to this is in Analects 17:25, where Confucius is recorded to have equated petty people and women; however, it is unclear exactly what he meant, and whether or not he was referring to women in general or just “petty” ones.) By the Song dynasty (960-1279 C.E.), mainstream political and intellectual discourse viewed both the ability and moral character of women as significantly inferior to those of men. The Confucian classic known as the Shijing (Book of Poetry) includes the controversial line “Male intellect builds states, female intellect topples states” (Zhou 2002: 489), which in the Song dynasty became understood as an argument for keeping women out of politics and state affairs. On this basis, the Neo-Confucian thinker Zhu Xi (1130-1200) criticized Wu Zetian, China’s only female emperor, arguing that failure to observe Dong’s three cardinal guides was ultimately responsible for the chaos, violence, and civil wars that had followed the Tang dynasty (618-907 C.E.). Later, during the Ming dynasty (1368-1644 C.E.), the Confucian thinker Zhang Dai (1597-1679 C.E.) developed the idea that males express virtuousness through their ability to debate and contend with one another, while women find virtuousness in lacking this skill. Although he did not expound much on this idea, it was taken to mean that women were both unable to contend with others, including their husbands, and ought not to do so. Their obedience was a display of morality. Similarly, men were expected to dominate their wives in a somewhat disrespectful manner in order to display their own ethical cultivation. In more extreme interpretations, Zhang’s notion was read as “a woman without talent is virtuous.” This was linked to the cosmological understanding of gender roles so that failure to follow these guides meant the betrayal of natural patterns—the traditional foundation for ethical norms. During this time, imperial law stated that any man over forty without a male heir must take on a concubine to aid him in producing one.

The domination of these views in both culture and philosophy caused the Chinese tradition to attach great importance to hierarchical gender roles. Social order based itself on cosmological theories that were automatically normative and constituted guidelines for moral cultivation. Despite the Book of Changes and Laozi’s emphasis on the importance of the interaction between yin and yang as complementary and mutually constitutive, women were generally regarded as inferior.

5. Family Patterns

Ideal political and social order in the state was regarded as a replication of the family model on a larger scale. The way neighbors interacted, friends treated one another, and ministers served rulers were all based on models of familial relationships. Early Confucian texts provided the ideological foundation for this pattern by arguing that morality must be cultivated at home first before it could be adequately practiced in society. In terms of gender, the hierarchical relationships in socio-political spheres were simply extensions of the superiority of husbands in spousal relations. The Record of Rituals explains, “Just as two rulers cannot coexist in one country, a household cannot have two masters; only one can govern” (Zheng 2008: 2353). Dong Zhongshu’s three cardinal guides promoted this attitude by requiring that wives listen to their husbands in the same way that children should listen to fathers and then further placing the spousal relationship below that of father and son. Zhu Xi bolstered this order by arguing that children should respect both parents, but that the father should be absolutely superior to the mother. Zhu recognized that there were aspects of life, mostly household affairs (nei), that women were well suited for, but saw men’s duties as superior, and therefore advocated that males always dominate females.

In line with the mutual relationship of yin and yang emphasized by the Book of Changes and Daoism, marriages were largely understood as being a deferential equivalence. The wedding rites in the Record of Rituals say that marriages are important for maintaining ancestral sacrifice and family lineages. The text describes that when a groom gives a salute, the bride can sit, and that during the ceremony they should eat at the same table and drink from the same bottle to display their mutual affection, trust, and support. This also aligns the woman, who had no official rank of her own, with her husband’s rank. The Record of Rituals further records that during China’s first dynasties, enlightened monarchs respected their wives and children, and that this is in line with natural order or dao. The Xiaojing (Classic of Filial Piety) also says that rulers should never insult even their concubines, let alone their wives. Although only leaders are mentioned, according to Chinese ethical systems people are supposed to emulate their superiors, so this deference would ideally be practiced in every household. However, such roles were largely based on function. For men this meant learning, working, and carrying on the ancestral line. Women were in charge of household affairs and principally responsible for producing a male heir. If they failed in the latter, their marital function was largely unfulfilled, which reflected poorly on the husband, as well. Since the woman’s function was largely mechanistic, her status was much lower and she was essentially anonymous, without independent social standing. Men could take on concubines to produce heirs or simply for pleasure, and while wives were “in charge” of concubines, they could also be (albeit rarely) replaced by them, and would have to serve the sons of concubines if they produced none of their own. Legally, men owned their wives, and there was often little practical recourse for a woman against her husband, even though the laws of certain periods allowed for it.

The Book of Poetry contains a large number of poems and songs describing marriage and love between men and women, some of which express the joys and sorrows of women. The collection includes lamentations of men going off on business or to war, and women’s complaints of being abandoned by their husbands after concubines are purchased. They are meant to remind husbands of social expectations and moral responsibilities. The Lienüzhuan (Biographies of Virtuous Women) and Xunzi both argue that the husband-wife relation is foundational for the family, and therefore for a stable society, as well. (The Zhongyong, or Doctrine of the Mean, adds that the sage’s virtue is found most simply in husband-wife relations.) Liu Xiang (77-6 B.C.E.), the compiler of the Lienüzhuan, firmly believed that morality starts in the family and reverberates out into society. He grouped virtuous women into six categories, or virtues: maternal rectitude (muyi), sage-like intelligence (xianming), humane wisdom (renzhi), purity and deference (zhenshun), chastity and dutifulness (jieyi), and skill in arguments and communication (biantong). Later editions of this text became less gender specific, but Liu emphasized women who were able to carry out certain female-related duties in role-specific conditions (including those of daughter, wife, daughter-in-law, and mother). Although Liu did not mention it, later texts argued that widows should not remarry or take on lovers. The Neo-Confucian thinker Cheng Yi (1033-1107) was one of the harshest interpreters of widow fidelity, claiming that widows should rather starve to death than take on a second husband. Zhu Xi, who disagreed with Cheng on many issues, argued that this was not practical; yet it was generally regarded as virtuous, even if not widely practiced. Cheng’s proposal was also important because he did not restrict such devotion to women, which created a rare sense of equality (of which Zhu also disapproved).

Analogous to yin and yang, the relationship of the wife and “inner” with the husband and “outer” is conceived of as complementary, not dualistic. According to the functional distinction of “inner” and “outer,” women were responsible for everything in the house, while men dominated external affairs. The most basic form of this division was given as “Men plow and women weave” (nan geng nü zhi). However, this distinction is not equivalent to the Western concepts of private and public. In fact, during the Wei-Jin period of national disunity (265-420 C.E.), it was common for women in northern Chinese states to handle family legal matters at court, go out to present gifts, and handle certain business matters. The woman’s role was not always marginalized, but it was focused on specific tasks. Chinese families often believed that educating their daughters well (though not necessarily in literary learning) was the precondition for improving the family and encouraging orderliness. Women were also often the primary caretakers and to some extent educators of all children, male or female—an invaluable role for the entire household. A couple’s shared goals, like obtaining wealth or educating children, were designated into separate spheres that either the wife or husband would control. The third-century B.C.E. philosophical miscellany known as Lüshi Chunqiu (Mr. Lü’s Spring and Autumn Annals) declares that husbands should have clothes to wear without weaving and wives have food to eat without farming because of their division of labor, which allows for a more efficacious family and society. Individual differences should be acknowledged so that the couple can support and assist one another.

6. Chinese Cultural Resources for Feminism

Taking yin and yang as an analogy for female and male, classical Chinese thought presents a complex picture of their interaction. Firstly, with thinkers such as Dong Zhongshu, the split between the two genders can be seen as relatively fixed. On this basis regulations on gender roles are equally stabilized, so that they are considered complementary, but not equal. The second major trend, seen most explicitly in the Laozi, values the inseparability of yin and yang, which is equated with the female and male. This interpretation explores the productive and efficacious nature of yin, or feminine powers. While not necessarily feminist, this latter view provides a robust resource for exploring feminism in Chinese thought. These two orientations were developed along the lines of their respective representatives in Chinese traditions.

Like the relationship between yin and yang, a complementary relationship can be seen between these two views on gender. Thinkers such as Confucius, Mengzi, Xunzi, Dong Zhongshu, and Zhu Xi are often taken to represent Confucianism, which belongs to the first viewpoint. The Laozi and Zhuangzi have then been seen as opposed to these thinkers, and are representative of Daoism. However, the actual relationship between these two “schools” is much more integrated. For example, Wang Bi wrote what is generally regarded as the standard commentary on the Laozi, and yet he considered Confucius to be a higher sage than Laozi. Similarly, actual Chinese social practices cannot be traced back to either Daoism or Confucianism exclusively, though one or the other may be more emphasized in particular cases. Taken as separate, they each highlight different aspects that, when integrated with one another, represent a whole. Although they are sometimes read as opposing views, both are equally indispensable for comprehending Chinese culture and history.

Despite the possibility of reading feminism into many Chinese texts, there can be no doubt that the Chinese tradition, as practiced, was largely sexist. For the most part, the inferior position of women was based on readings (whether or not they were misinterpretations) of texts generally classified as Confucian, such as the Record of Rituals, Book of Poetry, or Analects. On the other hand, other texts regarded as Confucian—such as the Book of Changes or Classic of Filial Piety—harbor rich resources for feminism in China. So while sexist practices are and were frequently defended on the basis of Confucian texts, this is limited to particular passages, and does not speak to the complexity of either Confucianism or Chinese traditions in general.

As a response to dominant practices, the Laozi—regardless of whether it was formed earlier or later than other major texts, such as the Analects—favors notions that counter (but do not necessarily oppose) early social values. While the Record of Rituals and Book of Poetry contain or promote hierarchical interpretations of gender issues, the Laozi clearly promotes nominally feminine characteristics and values. (This puts the Laozi in conflict with some branches of feminism that seek to destroy notions of “female” or gender-oriented traits and tendencies.) While this does not necessarily equate the Laozi with what is now called “feminism,” it does provide Chinese culture with a potential resource for reviving or creating conceptions of femininity in a more positive light.

The major philosophical concept in the Laozi is dao (way). The first chapter of the text claims that the unchanging dao cannot be spoken of, but it does offer clues in the form of a variety of images that appear throughout its eighty-one chapters. Several of the descriptions associate dao with the feminine, maternal, or female “gate.” In this context, dao is given three important connotations. It is responsible for the origin of all things, it is all things, and it provides the patterns that they should follow. The comparison to a woman’s body and its function of generation (sheng) identifies dao as feminine, and therefore speaks to the power of the female. The Laozi can therefore be read as advocating that female powers and positions are superior to their male counterparts. In modern scholarship, this is frequently noted, and several scholars have attempted to use the Laozi to support Chinese and comparative feminist studies. Images in the text strongly support these investigations.

For example, the text speaks of the gushen, the “spirit of the valley,” which is said to “never die” and is called xuanpin, or “mysterious femininity” (ch. 6). The character for “valley,” gu, originally meant “generation.” It is identified with sheng (part of the character for gender and tendencies), and its shape is sometimes taken to represent the female genitals. In other places, dao is referred to as the mother and said to have given birth to all things (ch. 52). Contemporary scholars also point out that there are no “male” images or traditionally male traits linked to dao in the Laozi. Dao’s characteristics, such as being “low,” “soft,” and “weak,” are all associated with yin and femininity, thereby forging a strong link between dao and the female.

Yin tendencies are not, however, exclusively valued. The Laozi offers a more balanced view, which is why it can be used as a resource of feminism, but is not necessarily feminist itself. For example, it says that all things come from dao and that they carry the yin and embrace the yang, and that their blending is what produces harmony in the world (ch. 42). Yin is arguably more basic, but is prized for its ability to overcome yang, just as the soft can overcome the hard and stillness can defeat movement. These notions are applied to many aspects of life, including sexual, political, and military examples. These examples revere female traits, arguing that yin should be acknowledged for its numerous strengths, but do not reject the importance of yang.

Taken as a political text, the Laozi argues that the ruler should take on more female than male traits in order to properly govern the world. This is supposed to allow him to remain “still” while others are in motion, ideally self-ordering. Although this confirms the usefulness of female virtue, it is not an argument for it being superior, or even equal to male counterparts. Rather, it demonstrates how female characteristics can be used to promote efficacy.

Given that sexist practices have largely been defended by reference to texts and scholars that self-identify with the Confucian tradition, it is easy to see why contemporary scholars have looked to the Laozi as one of the major sources for constructing Chinese feminism. It is certainly the first major Chinese philosophical text that explicitly promotes a variety of female traits and values, which allows room for feminist consciousness and discourse.

7. References and Further Reading

Ames, Roger T., and David L. Hall. Thinking from the Han: Self, Truth, and Transcendence in Chinese and Western Culture. Albany, NY: State University of New York Press, 1998.
- (This book includes a chapter on gender roles that outlines how the Confucian tradition can be used to establish a foundation for Chinese gender equality.)
Ames, Roger T., and Henry Rosemont Jr., trans. The Analects of Confucius: A Philosophical Translation. New York, NY: Ballantine Books, 1998.
- (An excellent translation of the Confucian Analects.)
Bossler, Beverly. Courtesans, Concubines and the Cult of Female Fidelity: Gender and Social Change in China, 1000–1400. Cambridge, MA: Harvard University Press, 2012.
- (A superb study of how female roles and virtues shaped Chinese family life, politics, and academics.)
Chen, Jiaqi. “Critique of Zhang Xianglong.” Zhejiang Academic Journal 4 (2003): 127–130.
- (This article points out inequalities of gender rules in Chinese philosophy and social systems.)
Moeller, Hans-Georg, trans. Daodejing: A Complete Translation and Commentary. Chicago, IL: Open Court, 2007.
- (Moeller’s commentary is sensitive to feminist interpretations of the Daodejing or Laozi.)
Rosenlee, Li-Hsiang. Confucianism and Women. Albany, NY: State University of New York Press, 2006.
- (A book-length study of gender roles in the Confucian tradition.)
Van Norden, Bryan, trans. Mengzi, with Selections from Traditional Commentaries. Indianapolis, IN: Hackett Publishing, 2008.
- (A masterful translation of the Mengzi with commentaries from traditional Chinese scholars.)
Wang, Robin R. Yinyang: The Way of Heaven and Earth in Chinese Thought and Culture. New York, NY: Cambridge University Press, 2012.
- (The best study of Chinese yinyang theory in English. The text also includes discussions of gender issues.)
Wang, Robin R. Images of Women in Chinese Thought and Culture: Writings from the Pre-Qin Period through the Song Dynasty. Indianapolis, IN: Hackett Publishing, 2003.
- (An excellent resource for gender issues in Chinese thought.)

Author Information

Lijuan Shen
Email: aashen@126.com
Xi’an University of Architecture and Technology
China

and

Paul D’Ambrosio
Email: pauljdambrosio@hotmail.com
East China Normal University
China

Theological Determinism

Theological determinism is the view that God determines every event that occurs in the history of the world. While there is much debate about which prominent historical figures were theological determinists, St. Augustine, Thomas Aquinas, John Calvin, and Gottfried Leibniz all seemed to espouse the view at least at certain points in their illustrious careers. Contemporary theological determinists also appeal to various biblical texts (for example Ephesians 1:11) and confessional creeds (for example the Westminster Confession of Faith) to support their view. While such arguments from authority carry significant weight within the traditions in which they are offered, another form of argument for theological determinism which has broader appeal draws on perfect being theology, or a kind of systematic thinking through the implications of the claim that God is—in the words of St. Anselm—quo maius cogitari non potest: that than which none greater can be conceived. The article below considers three such perfect being arguments for theological determinism, having to do with God’s knowledge of the future, providential governance of creation, and absolute independence. Implications of theological determinism for human freedom and divine responsibility are then discussed.

Reflection on theological determinism is both theoretically interesting and also practically important, especially for the lives of religious believers. On the one hand, for anyone who enjoys a good philosophical puzzle, thinking through the implications of this view offers the opportunity to consider whether various sets of propositions to which people sometimes subscribe—e.g. that God has exhaustive foreknowledge but that some events are not determined, or that God determines all events but that humans are culpable for their own sin—are in fact jointly consistent, and so what sort of systematic metaphysics is possible. On the other hand, whether all events in the world—and, in particular, personally significant events, such as the birth or death of a child, or the gain or loss of employment—are understood to be determined by God or not makes a significant difference to the attitudes that religious believers adopt and the decisions they make in response to such events in their own lives.

Defining Theological Determinism
Arguments for Theological Determinism
Theological Determinism and Human Freedom
Theological Determinism and Divine Responsibility for Evil
References and Further Reading

1. Defining Theological Determinism

As stated above, theological determinism is the view that God determines every event that occurs in the history of the world. What it means for God to determine an event may need some spelling out. Theological determinism is often associated with Calvinist or Reformed theology, and many proponents of Calvinism put their view in terms of the specificity of God’s decree, the efficaciousness of God’s will, or the extent of God’s providential control. John Feinberg, for example, describes his theological determinist position as the view that “God’s decree covers and controls all things” (2001, p. 504), while Paul Helm, another staunch theological determinist of the Calvinist variety, simply says that God’s providence is “extended to all that He has created” (1993, p. 39). The problem with such characterizations is that they are subject to multiple interpretations, some of which would be affirmed by theological indeterminists. For instance, a theological indeterminist might say that God’s providence extends to all events, or that even undetermined events are controlled or decreed by God in the sense that God foresees them and allows them to occur and realizes His purposes through them.

Thus one might think it better to define theological determinism in terms of divine causation, as Derk Pereboom does when he characterizes his view as “the position that God is the sufficient active cause of everything in creation, whether directly or by way of secondary causes” (2011, p. 262). The problem here is that some thinkers who seem committed to theological determinism deny that God should be considered a cause at all, at least in any sense univocal with that in which creatures are causes. Herbert McCabe, for instance, maintains that when we act freely, we are not caused to act by anyone or anything other than ourselves (1987, p. 12). This is not because McCabe thinks that our free actions are undetermined by God, but because he thinks that God is not an “existent among others,” as created causes are (1987, p. 14). Thinkers like McCabe sometimes appeal to Thomas Aquinas’ doctrine of analogy in explaining their view. According to this doctrine, as Austin Farrer explains it, God’s providential activity cannot be conceived in causal terms without “degrad[ing] it to the creaturely level and plac[ing] it in the field of interacting causalities”—the results of which can only be “monstrosity and confusion” (1967, p. 62). If the views of such Thomists are to count as versions of theological determinism, then we need a way of spelling out the view in non-causal terms.

Perhaps, then, theological determinism will have to be defined in terms of God’s decree or will or control after all; but if so, these concepts will have to be defined so as to rule out indeterministic interpretations. We might, for instance, take Feinberg’s definition of an “unconditional” decree as one “based on nothing outside of God that move[s] him to choose one thing or another” (2001, p. 527) and then characterize theological determinism as the view that God unconditionally decrees every event that occurs in the history of the world. Such a view would exclude the possibility that God merely permits some events which He foresees will happen in some circumstances but which He does not Himself determine.

2. Arguments for Theological Determinism

a. Divine Foreknowledge

One of the divine attributes that has been appealed to in arguments for theological determinism is God’s knowledge of future events, or (simple) foreknowledge. Numerous biblical passages support the idea that God knows all that the future holds, including the free choices of human beings. For instance, the New Testament records Jesus’ prophecies that Judas will betray him and that Peter will deny him three times; and in the Hebrew Bible, the psalmist declares to God, “In your book were written all the days that were formed for me, when none of them as yet existed” (Psalm 139:16). Furthermore, if we assume that there are truths about the future to be known (a question discussed below), then exhaustive divine foreknowledge—that is, God’s foreknowledge of every future event—may be thought to follow from considerations of perfect being theology, since to not know some truth would seem to be an imperfection.

But if God knows the future exhaustively, theological determinists argue, then all future events must be determined, directly or indirectly, by God. The reasoning they offer in support of this argument can be considered in two steps. First is the claim that for a future event e to be known at some time t (say, “in the beginning”), e must be determined at or prior to t. Otherwise, there would be no truth about e to be known at t. The second claim is that if all future events are determined from the beginning of time, they must ultimately be so by God, since nothing else existed in the beginning to determine them. This is not to say that God’s knowledge is causal, in the sense that simply by knowing something, God is the cause of that thing. Rather, proponents of this line of reasoning contend that God cannot know a proposition unless it is true; and the proposition that some event will occur cannot be true at some time, unless that event is determined by that time; but then if God knows that some event will occur when nothing but God exists, it must be God Himself who ultimately determines the event’s occurrence.

Various responses to this sort of argument, for the incompatibility of divine foreknowledge and undetermined events, have been offered in the history of theology. One popular reply first made by Boethius is to deny that God knows anything at some time, since God exists outside of time altogether and knows all things from an eternal perspective. Another response, inspired by William of Ockham, is to grant the possibility of temporal divine knowledge but deny that what God foreknows must be determined by God. Alvin Plantinga (1986), for instance, has argued that creatures can have a sort of counterfactual power over God’s past knowledge, such that they make it the case that God knows what they themselves determine.

One final, more radical response to this argument is to deny that God has exhaustive foreknowledge. Defenders of open theism, who take this route, maintain that God leaves some future events undetermined, and so does not know exactly what the future holds. This is not to say that God is not omniscient. Rather, according to some open theists, propositions about undetermined events are simply not true (or false) before those events occur; or, according to others, there are true propositions about undetermined events, but they are in principle unknowable. Either way, open theists maintain that it is not a real limitation on God not to know what it is impossible to know, and so the denial of exhaustive foreknowledge is compatible with the affirmation that God is a supremely perfect being.

None of these responses to the argument for theological determinism just described are without their critics, however. In reply to the Boethian proposal, questions have been raised about the coherence of the claim that God—a personal being who acts—exists altogether outside of time. Furthermore, the appeal to divine eternality may not even solve the problem, since a parallel argument for theological determinism can be constructed on the assumption that God knows timelessly all that the future—considered from our perspective—holds. Likewise, in reply to the Ockhamist solution, some have questioned whether there is any real distinction between counterfactual power over God’s knowledge of the past and the power to bring about the past, the latter of which seems problematic if not impossible. Finally, many philosophers reject the open theist claim that there are propositions about the future that are neither true nor false, since such a claim requires the denial of the widely accepted principle of bivalence. And the alternative open theist view, that there are true propositions about the future that are unknowable by God, seems to call into question divine omniscience. Furthermore, many theists reject open theism as unorthodox and incompatible with divine sovereignty and providential care of creation—an issue to be discussed below.

b. Divine Providence

In addition to attributing to God exhaustive foreknowledge—or knowledge of all that will happen in the future—many theists are also committed to the claim (explicitly or implicitly, in virtue of other things they believe) that God has exhaustive knowledge of counterfactual conditionals, or facts about what would happen if circumstances were different than they in fact are. One famous biblical example of such knowledge is found in the Hebrew Bible, when David consults God about a rumor he has heard:

David said, “O Lord, the God of Israel, your servant has heard that Saul seeks to come to Keilah, to destroy the city on my account. And now, will Saul come down as your servant has heard?…” The Lord said, “He will come down.” Then David said, “Will the men of Keilah surrender me and my men into the hand of Saul?” The Lord said, “They will surrender you.” (1 Samuel 23: 10-12, N.R.S.V.)

Upon hearing this news, David and his men decide to leave Keilah, and thus Saul, learning that David has left, never ends up going there himself, and the men of Keilah never have the chance to surrender David to him. Thus the truths that the Lord revealed to David are of the counterfactual sort: if David had remained in Keilah, Saul would have sought him there; and if Saul had sought him there, the men of Keilah would have surrendered David to Saul.

Some philosophers have argued that exhaustive divine knowledge of such counterfactual conditionals is essential to God’s perfection—in particular, to God’s sovereignty and providential care for creation—and that such knowledge entails theological determinism. The argument has centered on what are called “counterfactuals of freedom,” or those counterfactual conditionals about what a possible created person (who may or may not ever exist) would freely do in a possible circumstance (which may or may not ever occur). The free actions in question are supposed to be libertarian, or those that are not determined, either by a prior state of the world or by God. Luis de Molina considered knowledge of such counterfactuals to be part of God’s scientia media, or middle knowledge, standing in between God’s “natural knowledge,” or knowledge of God’s own nature and the necessary truths that follow from it, and “free knowledge,” or knowledge of God’s will and the contingent truths that follow from it. Molina claimed that, like the propositions included in God’s natural knowledge, counterfactuals of freedom are pre-volitional, or (logically) prior to, and thus independent of, God’s will; though like the propositions included in God’s free knowledge, they are contingent truths.

One way to reconstruct the line of reasoning from divine knowledge of counterfactual conditionals to theological determinism is thus as follows:

1. If there are any events in the history of the world that are not determined by God, then, contra Molina, God cannot have exhaustive knowledge of counterfactual conditionals.
2. If God lacks exhaustive knowledge of counterfactual conditionals, then God takes risks with creation.
3. A God who takes risks with creation is not perfect.
4. Therefore, since God is perfect, God must determine every event in the history of the world.

Robert Adams has argued in favor of the first premise, focusing in particular on the possibility of God’s knowledge of counterfactuals of freedom. Adams contends that for God to know a proposition, it must have a truth-value; but counterfactuals of freedom lack truth-values, since there is nothing that could ground their truth. While the consequent of a conditional may follow from the antecedent by logical or causal necessity, neither sort of necessity can ground the truth of a conditional about how a person would act if placed in a particular circumstance, if that action is undetermined. And features of a person that do not necessitate her action—such as her particular beliefs and desires—cannot ground the truth of counterfactual conditionals about her action, precisely because such features are non-necessitating. Adams suggests that divine foreknowledge may not face the same grounding problem as middle knowledge, since categorical predictions about undetermined events “can be true by corresponding to the actual occurrence of the event that they predict” (1987, p. 80). But in the case of counterfactual conditionals, there may never be actual events to which the propositions correspond.

Supposing Adams is right that middle knowledge is impossible, what would divine providence look like without it, on the assumption that God does not determine some events in the world? One might think that all God really needs to providentially govern the world is foreknowledge. Yet William Hasker has argued “foreknowledge without middle knowledge—simple foreknowledge—does not offer the benefits for the doctrine of providence that its adherents have sought to derive from it” (1989, p. 19). His reasoning, in brief, is that foreknowledge is about what will actually happen in the world God has created, and so will be useless to God in deciding what to create to begin with or how to arrange events throughout history for the benefit of creatures. Consider, for example, the biblical case discussed above, in which David consults God to determine the best strategy for avoiding capture by Saul. If God had only simple foreknowledge and not middle knowledge, then God could only tell David what he would in fact do, and what Saul’s response would in fact be, and not what better or worse outcomes might result from alternative courses of action. Likewise—and perhaps more worrisome—before creating the world, God could not know without middle knowledge whether, if He gave creatures the libertarian freedom to decide whether to enter a loving relationship with Him and their fellow creatures, any of them would indeed choose to do so. Thus, creating a world with such indeterministic events is risky business for God. In contrast, the view in which God determines all events of the world can be considered a risk-free view of providence.

While Hasker goes on to defend the risky view of providence, others have criticized it as inconsistent with divine perfection. Edwin Curley (2003) has argued that it involves a kind of recklessness inconsistent with the providential wisdom and concern for creatures that is supposed to be characteristic of a perfect Creator. Focusing in particular on indeterminism at the level of human action, Curley points out that a God who gave creatures libertarian freedom without knowing how they would use it would run the risk of their destroying themselves and thwarting God’s purposes for creation. Thomas Flint similarly argues for the superiority of the risk-free view of providence by means of a parental analogy. Imagine, he says, that a parent has two options for her child: under Option One, the child may struggle and seem to be in danger, but the parent will “know with certainty that she will freely develop into a good and happy human being who leads a full and satisfying life”; under Option Two, in contrast, the parent will have no idea how things will turn out for the child, and can only hope for the best. Flint says he would, without hesitation, choose Option One, and that the claim that Option Two is preferable is “just short of absurd” (1998, p. 106). Likewise, he suggests, the claim that a risk-taking God is superior to, or even on par with, a risk-avoiding one is incredible.

If the above line of reasoning is correct, then it follows that a supremely perfect God would not create a world in which events were left undetermined. However, the argument has been questioned on a number of points. With respect to Adams’ argument against the possibility of middle knowledge, at least two assumptions are open to doubt. First, it is unclear whether, for a proposition to have a truth-value, there must be something that grounds its truth. Francisco Suárez, an early follower of Molina, seemed to question this claim. Richard Gaskin has as well, maintaining that there is nothing that grounds the truth of any proposition, and that to suppose otherwise “is to slide into a substantial and implausible correspondence theory of truth” (1993, pp. 424-425).

Others, granting that true propositions may need grounding, have proposed possible grounds for counterfactuals of freedom. Alvin Plantinga, for instance, has suggested a parallel between counterfactuals of freedom and propositions about past events. He writes: “Suppose… that yesterday I freely performed some action A. What was or is it that grounded or founded my doing so?… Perhaps you will say that what grounds [the truth of the proposition that I did A] is just that in fact I did A” (1985, p. 378). Plantinga responds that the same kind of answer is available in the case of counterfactuals of freedom; for what grounds such truths is the fact that certain people (actual or possible) are such that if they were put in certain circumstances, they would do certain things.

Other theists who accept that God lacks exhaustive knowledge of counterfactual conditionals question whether this entails that God lacks the sort of providential control over creation essential to His perfection. David Hunt has argued, contra Hasker, that simple foreknowledge can in fact give God a “providential advantage,” allowing Him to “secure results” that He would not be able to secure without such knowledge (2009). If with simple foreknowledge God can thus ensure His central purposes for creation, perhaps the charge that theological indeterminism entails risk-taking with respect to less significant outcomes will not have so much sting.

Alternatively, one may argue with open theists that the risky view of providence involves divine virtues such as experimentation, collaboration, responsiveness, and vulnerability, and that it is the only way to secure the great metaphysical and moral value of creatures with libertarian freedom. One way to put this latter point is in terms of Flint’s parental analogy. After noting that he would of course choose (risk-free) Option One if he could, Flint says, “the fact that we don’t have a choice here, that we as parents are stuck with [risky] Option Two, is one of the things that is especially frustrating (and even terrifying) about being a parent” (1998a, p. 106). An open theist convinced of the impossibility of middle knowledge might respond that this must similarly be what is especially frustrating (and even terrifying!) about being God—that Option One is not available, so that if God wants to create persons with libertarian freedom, He must opt for Option Two. But just as a parent still chooses to give birth to a child, so God still chooses to bring into being such creatures, because of their great value.

c. Divine Aseity

A third argument for theological determinism focuses on the divine attribute of aseity. The word aseity, which comes from the Latin phrase a se (“from itself”), refers to God’s absolute independence from anything distinct from Himself. While some have taken divine aseity to be the most fundamental feature of our conception of God, others have suggested that it follows from God’s perfection, since to be dependent on another would seem to be an imperfection (Brower 2011). Closely related to the concept of divine aseity is the medieval conception of God as pure act (actus purus). What medieval thinkers meant by saying that God is pure act is that He is always complete in Himself, always “all that He can be.” In contrast, in created beings there is potentiality and passivity, meaning that they are not all that they can be, but can be changed and acted on by others.

On the basis of considerations of God’s aseity and pure actuality, Reginald Garrigou-Lagrange has offered an argument for theological determinism. Those who maintain that there are some events that God does not determine (for instance, human choices), he says, must posit “a passivity in the pure Act. If the divine causality is not predetermining with regard to our choice… the divine knowledge is fatally determined by it. To wish to limit the universal causality and absolute independence of God, necessarily brings one to place a passivity in Him” (1936, p. 538). To illustrate his point, Garrigou-Lagrange asks us to imagine that when God gives two men grace to fight temptation, one cooperates with this grace while the other does not, but that the difference between their responses is not determined by God. Supposing that God can foreknow the two men’s responses to His grace, theological indeterminists must admit that “the foreknowledge is passive,” just as a person’s knowledge is passive when she is a mere spectator to some event (1936, pp. 538-539). What Garrigou-Lagrange seems to mean by this suggestive phrasing is that God’s intellect would be passive, in the sense that in coming to know what the two men will do, God’s intellect would be acted upon by something outside of it. Garrigou-Lagrange concludes:

God is either determining or determined, there is no other alternative. His knowledge of free conditional futures is measured by things, or else it measures them by reason of the accompanying decree of the divine will. Our salutary choices, as such, in the intimacy of their free determination, depend upon God, or it is He, the sovereignly independent pure Act, who depends upon us. (1936, p. 546)

In response to this argument for theological determinism, Eleonore Stump contends that the dilemma presented by Garrigou-Lagrange—that God either determines or is determined—is a false one, if determination is taken to be equivalent to causation. She offers examples of both divine and human knowledge in which the knower neither determines what she knows, nor is determined by it. On the human side, a person might know that an animal is a substance, but the human obviously does not determine this truth. And (on Thomas Aquinas’ view of human cognition—which Garrigou-Lagrange would presumably accept) neither is the human rendered passive, or determined in her knowledge of this truth, since the human intellect’s operations are active in the process of deriving it, and nothing acts on the intellect “with causal efficacy” in this process. Likewise, on the divine side, God presumably knows of His own existence without determining that He exists; but neither, presumably, is God determined in His knowledge of this truth (2003, pp. 120-121).

One thing to note about the examples offered by Stump—of a human knowing that an animal is a substance, or of God knowing that He exists—is that the truths known are in both cases necessary. One question that a theological determinist might raise is whether, when it comes to knowledge of contingent events, the indeterminist can likewise maintain that the knower neither determines nor is determined by what she knows. While our coming to know necessary truths on the basis of, say, complex mathematical reasoning would seem to be quite an active process, our coming to know contingent truths on the basis of some very clear and distinct perception—say, that we have hands—would seem to be more passive. If this is right, then the theological determinist might maintain that if God’s knowledge of undetermined future events is quasi-perceptual, then God might indeed be rendered passive by such knowledge. Furthermore, even if the theological indeterminist can defend a conception of divine foreknowledge on which God is not determined by some of what He knows, in the sense that He is not caused to know some truths, it is very hard to see how He would not in some sense be dependent on something outside of Himself for that knowledge. The question for theological indeterminists is whether this sense of dependency is compatible with a conception of God as supremely perfect.

3. Theological Determinism and Human Freedom

So far we have considered arguments that theological determinists have put forward in support of their view of divine providence, as well as some objections raised to these arguments. Critics of theological determinism object not only to the positive reasons offered in favor of the view, but also to certain negative implications. One major issue theological determinists must grapple with is how there can be any creaturely freedom in a world in which all events are determined by God. The claim that at least some creatures are both free and responsible for their actions is a central part of the traditional Western theisms (Judaism, Christianity, and Islam), and most contemporary theological determinists affirm this claim, though as we will see, some within these traditions dissent from it. Below, several theological deterministic conceptions of human freedom are discussed.

a. Standard Compatibilism

Perhaps the most common conception of free will espoused by theological determinists is the standard compatibilist one: that determinism of any sort, whether theological (that is, determination by God) or natural (that is, determination by antecedent events in accordance with the laws of nature), does not automatically rule out free will. Theological determinists espousing this view often appeal to secular theories of freedom and arguments for the compatibility of such freedom with natural determinism to support their claim that theological determinism is also compatible with free will. For instance, according to the classic compatibilist position defended by Thomas Hobbes, a person is free to the extent that she finds no impediment to doing what she wants or wills to do.

Contemporary compatibilists, recognizing the limitations of this position—for example that it allows for actions resulting from brainwashing to be free—have offered various refinements, such as that, in addition to being able to do what one wants or wills to do, one must act with sensitivity to certain rational considerations (the reasons-responsive view), or one must have the will one wants to have (the hierarchical model). One proponent of the latter view is Lynn Rudder Baker. According to Baker, “Person S has compatibilist free will for a choice or action if:

1. S wills X,
2. S wants to will X,
3. S wills X because she wants to will X, and
4. S would still have willed X even if she (herself) had known the provenance of her wanting to will X.” (2003, p. 467)

Baker notes that her account is compatibilist in the sense that “a person S’s having free will with respect to an action (or choice) A is compatible with A’s being caused ultimately by factors outside of S’s control.” She makes no distinction, with respect to the question of an agent’s freedom, whether the agent’s action is caused “by God or by natural events” (2003, pp. 460-461). More generally, theological determinists point out that on all such contemporary compatibilist accounts of free will, divine determination does not automatically rule out human freedom, since none of these accounts specifies what must be true of the first causes of human volition and action. This lack of specificity, however, is precisely the problem that incompatibilists—those who hold that determinism of any sort is incompatible with free will—find with the compatibilist position. They reason that if either God or events of the distant past are the ultimate causes of our actions, then our actions are not under our control. The debate between compatibilists and incompatibilists has a long history, and is ongoing. See “Free Will” for a more in-depth summary.

b. Theological-but-not-Natural Compatibilism

While many theological determinists take the standard compatibilist line, some differentiate between natural and theological determinism, and maintain that only the latter is compatible with free will. Defenders of this position, who might be called “theological-but-not-natural-compatibilists,” appeal to a number of differences between theological and natural determinism to support their view. Hugh McCann, for instance, argues that in contrast to the way in which events that we bring about come to pass, “the manner in which our actions come to pass is not one in which God acts upon us or does anything to us” (2005, p. 145). McCann maintains that God’s causing our actions is like an author’s creating the characters of a novel. He writes: “The author of a novel never makes her creatures do something; she only makes them doing it. It is the same between us and God” (2005, p. 146).

McCann should not be interpreted as denying theological determinism here—that is, as saying that God does not determine what creatures do, but only what they are. Rather, he means that, unlike creatures who can only make other creatures do things, God has the unique ability to make creatures themselves; and rather than first bringing creatures into being, and then making them do certain things, God by one and the same act makes creatures doing the things they do. McCann contends that because of such differences between divine and creaturely causation, theological determinism “does not endanger our freedom” as natural determinism does (2005, p. 146).

However, theological compatibilism, like its natural counterpart, has been criticized by standard incompatibilists. One of the most influential arguments for the incompatibility of causal determinism and human freedom—the Consequence argument—relies on the premise that, in a deterministic world, the ultimate causes of our actions are events of the distant past. The reason why this is considered a problem, though, is simply that such causes lie outside of our control. So if the Consequence argument establishes the incompatibility of free will and natural determinism, a parallel argument appealing to the fact that God’s will, taken as a determining cause, likewise lies outside of our control should establish the incompatibility of free will and theological determinism. To put the point differently, it seems that those who hold that God’s determination of our actions is both causal, and compatible with human freedom, ought to be standard compatibilists about determinism and free will, rather than theological-but-not-natural compatibilists, since the differentiating features of natural determining causes pose no additional threat to free will, once one accepts that God’s determining causation is compatible with human freedom.

c. Libertarianism

While the theological determinists described above, who maintain that theological determinism is compatible with human freedom while natural determinism is not, suggest various differences between divine and natural determination, they still recognize God’s determination as a species of causation. As mentioned already, however, some who seem to espouse theological determinism deny that God should be considered a cause at all, at least in any univocal sense as creatures are. Writing in this tradition, Michael Hoonhout applauds Aquinas for intentionally discussing the doctrine of divine providence twice in his Summa Theologiae—first in the context of “the essence of God” and then in the context of “the nature of creation”—in recognition of “two radically different orders of intelligibility.” He maintains that “double affirmations which seemingly contradict each other are to be expected” if we respect the integrity of each order (2002, pp. 4-6).

The seemingly contradictory “double affirmations” to which Hoonhout refers are that God determines everything that occurs in the world, and that humans have a non-deterministic form of freedom. Thus one finds some theologians who seem clearly committed to theological determinism when considering the order of the Creator, speaking of the possibility of libertarian human freedom in the context of the order of creation. Kathryn Tanner, for instance, maintains a view of divine causation as absolute in terms of both its range (“all inclusive or universally extensive”) and its efficacy (“cannot be hindered, diverted, or otherwise redirected by creatures”). Tanner claims that since “God does not bring about the human agent’s choice by intervening in the created order as some sort of supernatural cause,” one can “still affirm a very strong libertarian version of the human being’s freedom” (1994, pp. 113, 125, 126).

The trouble with such a view, however, is that it seems to face a dilemma. On the one hand, if the way in which God determines events in the world is really nothing like the way creaturely causes do, such that even fundamental concepts like conditional necessity do not apply to the relationship between God’s causal activity and its effects, then, as Thomas Tracy points out (1994), analogy collapses into equivocation, and we are left without any idea of what theological determinism is supposed to mean. On the other hand, if such fundamental concepts do apply to divine causation in something like the way they apply to creaturely causation, then arguments against the compatibility of theological determinism and human freedom must be considered and responded to, rather than simply dismissed as involving a confusion of categories.

d. Hard Determinism

One final position that theological determinists may adopt on the issue of human freedom is the standard incompatibilist one, admitting that determinism of any sort is incompatible with free will and thus that there can be no creaturely freedom. This view, called hard theological determinism, has historically won few adherents, in part because of the centrality of the belief in human freedom to so much civic and religious life. On the civic side, the assumption of free will has been thought to underwrite reactive attitudes such as resentment, indignation, gratitude, and love, and the moral and legal practices of praise and blame, reward and punishment. On the religious side, human freedom has seemed crucial to the logic of divine commandment and judgment, and to such reactive attitudes and practices as guilt, repentance, and forgiveness.

However, some hard theological determinists have challenged such assumptions about the centrality of free will. Derk Pereboom, for instance, has argued that, while theological determinism is not compatible with the basic sense of desert (that is, deserving praise or blame simply because of the moral status of what one has done), it is compatible with judgments of value (for example, that behavior is good or bad), as well as the reactive attitudes and practices which are most central to traditional theism, and which might seem to presuppose basic desert. For instance, a person without free will might still recognize that she has failed to act according to the principles she believes she should live by, and so experience guilt; or she might resolve to no longer hold another’s past behavior as a reason to remain at odds with him, and so forgive. Pereboom suggests that God’s commanding and judging, rewarding and punishing may serve the moral formation of creatures even without free will, and so may be justified without it. However, some critics have questioned whether such religiously significant attitudes and practices as repentance and the resolution to amend one’s life can really be secured without a sense of either basic desert or the sort of agential control which hard theological determinists deny. Furthermore, even if hard theological determinism is compatible with such attitudes and practices central to theistic traditions, it is another question whether the denial of free will and moral responsibility in the basic-desert sense is itself compatible with the teachings of these religions. One question that remains for hard Christian determinists, for example, is how to make sense of the many New Testament passages that discuss the freedom found in Christ (cf. Galatians 5:1, 2 Corinthians 3:17).

4. Theological Determinism and Divine Responsibility for Evil

Besides explaining how, on their view, humans can be free and responsible for their own actions (or how the denial of human freedom is compatible with traditional theism), theological determinists must also face questions about God’s moral responsibility for the evil in the world that, on their view, He determines. As with the former issue, their responses to the latter are many and varied. Below, a number of distinct responses are discussed.

a. Theodicies and Defenses

Some theists attempt to offer a theodicy, or plausible explanation of why God has created a world in which evil exists. Others, uncertain of what God’s actual reasons are, propose instead a defense, or possible explanation. One historic and popular explanation of why evil exists in a world created by God is the free will defense, first proposed by St. Augustine and developed by Alvin Plantinga (1974). According to this defense, the evil we witness in God’s creation is not in fact God’s doing at all, but the result of humans’ misuse of their own freedom: God created humans to live in harmony with Himself and each other, but they freely chose to rebel against God and to sin against one another. Some proponents of this defense extend it to explain natural as well as moral evil, suggesting that all suffering in the world is ultimately due to sinful choices of fallen creatures, some of which lie behind the destructive natural forces of the world. However, the free will defense seems to assume that it was impossible for God both to create free persons and to determine all of their actions, such that they never do evil. In other words, it seems to assume an indeterministic conception of human freedom incompatible with theological determinism. Thus, the traditional free will defense would not seem to be an option for theological determinists.

Some compatibilists have argued, however, that the free will defense need not presuppose an indeterministic conception of human freedom. Jason Turner, for instance, suggests that if “free actions can be determined but must not be dependent on another’s will”—a view he calls “independent compatibilism”—then the free will defense may still be open to theological determinists (2003, p. 131). On independent compatibilism, whether God could create a world with free persons who were determined in their actions and never committed moral evil depends on whether God would create such a world because the persons never committed evil, or for some other reason. Supposing that the reason God would create a world in which persons who were determined in their actions never committed moral evil was indeed because they never committed evil, their actions would be dependent on God’s will, and so would not be free.

While there thus may be some versions of the free will defense open to the theological determinist, such versions require metaphysical assumptions that may seem implausible—for instance, that events in the causal history of an agent’s action occurring before she was even born may determine whether her (determined) actions are free or not, and that whether an event depends on God’s will in a freedom-undermining way depends on what God’s reasons were for causing it. Still, theological determinists may argue that even the traditional indeterministic version of the free will defense is implausible, and that more plausible explanations of evil are available. John Hick, for instance, contends that, given a modern understanding of evolutionary theory, the claim that humans were created perfect and fell from grace is an incredible one. Inspired by the writings of St. Irenaeus, Hick proposes instead the soul-making theodicy, according to which God created imperfect creatures in a world in which they are prone to suffering and sin. He argues that it is not the freedom of creatures, per se, which is so valuable as to outweigh these evils, but rather their development, morally and spiritually, through struggle, suffering, trial and temptation, and the virtuous characters which result from “the investment of costly personal effort” (2010, p. 256). While Hick is himself committed to theological indeterminism, his basic theodicy is compatible with theological determinism as well.

Two other theodicies that theological determinists have adopted likewise focus on the value of development or process. Eleonore Stump has suggested that a world of sin and suffering is “most conducive” to bringing about both humans’ willingness to receive the gift of salvation from God and also their subsequent sanctification (1985, p. 409). While Stump holds that human freedom is incompatible with theological (and natural) determinism, and that receiving the gift of salvation and undergoing the process of sanctification both require free will, Derk Pereboom contends that “no feature of [her] account demands libertarian freedom, nor even a notion of free will of the sort required for moral responsibility… It is sufficient that this change [the turning to God on the occasion of suffering] is seriously valuable, and that it results in more intimate relationship with God” (2015). Marilyn McCord Adams, likewise, has proposed that participating in evil might facilitate creatures’ identification with Christ and union with God (1999). Such work on theodicy has drawn on specifically Christian conceptions of God and the human good, and advanced them in innovative ways. Yet, these proposals raise many questions about the value of process—developing moral character, becoming sanctified, or coming to identify with God—as well as the comparative value of such processes with the disvalue of the sin and suffering that make them possible.

b. Causing vs. Permitting Evil

Even supposing the disvalue of all sin and suffering in the world is outweighed by the value of the moral development of creatures, another concern critics have raised is whether it is morally permissible for God to cause humans to sin in order to realize some good. Peter Byrne, in response to Paul Helm’s deterministic theodicy, asks:

How does it square with the Pauline injunction that one should not do evil that good may come of it? The place of that injunction in traditional moral theology is to set limits to how far we can pursue good by way of doing evil as its precondition. There are some acts that are so heinous that one may not do them for the sake of the bringing about a greater good…. One may not murder that good may come of it. But Helm’s God has precisely planned, purposed, and necessitated acts of murder and instances of other kinds of horrendous wickedness so that good may come of them. (2008, p. 200)

In response, some theological determinists have argued that the difference between God’s causing humans to commit sin for the purpose of realizing some good (the theological determinist’s view), and knowing that humans would sin if they were created in particular circumstances and choosing to create them in those circumstances anyway, for the purpose of realizing some good (the Molinist view), is morally insignificant. Indeed, theological determinists contend, even the open theist’s view, according to which God allows horrendous evil that He could prevent—presumably for the purpose of realizing some good—raises similar questions about God’s moral responsibility for evil. So, they maintain, this concern about divine responsibility should not be a reason to reject theological determinism in favor of such competing views of divine providence.

c. God not a Moral Agent

While some theological determinists offer theodicies or defenses in an attempt to demonstrate that there is some actual or possible reason for evil which morally justifies God in creating it, others eschew such explanations altogether. Some argue that they are unnecessary, on the grounds that God cannot, in principle, be morally responsible for anything, since He is above or beyond morality altogether. One line of argument for this conclusion is based on the idea that morality depends on God’s will and command, and that God is not Himself subject to the commandments that He establishes. Morality, on this view, only applies to creatures, over which God has ultimate moral authority. One problem facing such a divine command theory of morality is the familiar Euthyphro problem: if God’s commandments determine the content of morality, then morality is arbitrary, such that what is right might have been wrong and vice versa if God had willed that it be so. Another implication of this argument that many theists find difficult to accept is that, if God cannot in principle be morally blameworthy since He is above morality, then He cannot be morally praiseworthy either.

d. Sin not Blameworthy

An alternative response to the question of how God could not be blameworthy for causing humans to sin is the hard theological determinist one. As discussed above, hard theological determinists maintain that, since God causes all events in creation, humans are not free or morally responsible in the basic desert sense. As Derk Pereboom notes, it follows on this view that since humans are not blameworthy for their actions, God is not the cause of blameworthy actions. Thus, God’s causing human sin is more similar to His causing natural evils, such as animal predation and its associated sufferings, than it is to His causing moral evils, traditionally understood. Since most theists agree that God has control over all such natural forces, the problem of natural evil poses no more difficulty for the theological determinist than for the theological indeterminist. However, this hard deterministic response to the problem of moral evil is compatible with the offering of a theodicy or defense particular to human sin, as well as with the appeal to skeptical theism discussed below.

e. Skeptical Theism

One final response to the problem of evil that theological determinists make is to admit that they are unable to think of reasons that would justify God in creating a world with the sort and extent of evil that we see, but nevertheless to maintain that such an inability should not be taken as good evidence that there is no divine justification for evil. This is the response offered by skeptical theists, so named because of their skepticism about their own ability to discern God’s reasons for creating and governing the world as He does. Several lines of reasoning have been offered for this position, ranging from arguments from analogy, likening the cognitive distance between us and God to that between a very young child and her parents, to arguments focusing on the massive complexity of the causal networks in the world, and our inability to comprehend how actual and possible goods and evils are connected. The view has also been subject to various objections, regarding purported negative implications of the view for theological knowledge and trust in God, and moral deliberation and action. The debate regarding these issues is ongoing, and the interested reader should see Skeptical Theism for more information.

While skeptical theism is a response to the problem of evil available to theological determinists and indeterminists alike, theological determinists who embrace the view must grapple with further issues. Like those offering a theodicy or defense, theological determinists who maintain their justified ignorance of God’s reasons must still come to terms with the fact that, on their view, evil is not merely permitted but determined by God. This would seem to lead to a sort of double-mindedness specifically about the value of moral evil in the world. It is, after all, central to religious practice to strive to see the events in one’s life from God’s perspective, and to value them as God would, in His wisdom and benevolence. Thus, if some horrendous evil (say, severe child abuse) is divinely determined, then one ought to strive to accept, and even embrace, it as instrumental to God’s purposes and so for the greater good. Such an attempt, however, would seem to be in serious tension with a teaching central to traditional theism, that moral evil is opposed by God, and should be opposed by humans as well.

5. References and Further Reading

Adams, Marilyn McCord (1999). Horrendous Evils and the Goodness of God. Ithaca, NY: Cornell University Press.
- Contains proposal that experience of evil might facilitate humans’ identification with Christ and union with God.
Adams, Robert (1987). “Middle Knowledge and the Problem of Evil.” The Virtue of Faith and Other Essays in Philosophical Theology. New York: Oxford University Press.
- Raises grounding objection against the possibility of middle knowledge.
Baker, Lynne Rudder (2003). “Why Christians Should Not Be Libertarians: An Augustinian Challenge.” Faith and Philosophy, Vol. 20, No. 4, pp. 460-478.
- Argues for compatibilism on the basis of tradition, and offers standard compatibilist account of free will.
Basinger, David and Randall Basinger (1986). Predestination and Free Will: Four Views of Divine Sovereignty and Human Freedom. Downers Grove, IL: InterVarsity Press.
- Contains discussion of how embracing theological determinism might shape one’s personal deliberations and decision-making.
Boethius (1969). The Consolation of Philosophy. Trans. V. E. Watts. New York: Penguin Books.
- Contains proposal of divine timelessness as resolution to the problem of divine foreknowledge and human freedom.
Brower, Jeffrey (2011). “Simplicity and Aseity.” The Oxford Handbook of Philosophical Theology. Ed. Flint, Thomas and Michael Rea. Oxford: Oxford University Press.
- Defines aseity and summarizes argument for theological determinism on the basis of aseity.
Byrne, Peter (2008). “Helm’s God and the Authorship of Sin.” Reason, Faith and History: Philosophical Essays for Paul Helm. Ed. M. W. F. Stone. Burlington, VT: Ashgate.
- Raises concern that Helm’s theological determinism commits him to the claim that God “plans, purposes, and values moral evil.”
Curley, Edwin (2003). “The Incoherence of Christian Theism.” The Harvard Review of Philosophy, Vol. 11, pp. 74-100.
- Contains argument that the risky view of providence is incompatible with divine wisdom and care for creation.
Farrer, Austin (1967). Faith and Speculation. London: A. and C. Black.
- Explicates the doctrine of analogy and its implications for the “paradox” of divine agency and human freedom.
Feinberg, John S. (2001). No One Like Him. Wheaton, IL: Crossway Books.
- Defends theological determinism on biblical, theological, and philosophical grounds, and responds to a number of objections to the view.
Flint, Thomas (1998). Divine Providence: The Molinist Account. Ithaca, NY: Cornell University Press.
- Contains argument for superiority of the risk-free over the risky view of providence.
Gaskin, Richard (1993). “Conditionals of Freedom and Middle Knowledge.” The Philosophical Quarterly, Vol. 43, No. 173, pp. 412-430.
- Argues against claim that counterfactuals of freedom need grounds.
Garrigou-Lagrange, R. (1936). God, His Existence and His Nature: A Thomistic Solution of Certain Agnostic Antinomies, Vol. 2. Trans. Dom Bede Rose. London: B. Herder Book Co.
- Contains argument for theological determinism on the basis of God’s aseity.
Hasker, William (1985). “Foreknowledge and Necessity.” Faith and Philosophy, Vol. 2, No. 2, pp. 121-156.
- Criticizes Plantinga’s distinction between counterfactual power over the past and the power to bring about the past.
Hasker, William (1989). God, Time and Knowledge. Ithaca, NY: Cornell University Press.
- Contains argument that simple foreknowledge is providentially useless to God.
Helm, Paul (1993). The Providence of God. Downers Grove, IL: InterVarsity Press.
- Contains arguments for the “risk-free” view of providence on the basis of divine knowledge and goodness.
Hick, John (2010). Evil and the God of Love. New York: Harper and Row.
- Contains explication and defense of the soul-making theodicy.
Hoonhout, Michael (2002). “Grounding Providence in the Theology of the Creator: The Exemplarity of Thomas Aquinas.” The Heythrop Journal, Vol. 43, No. 1, pp. 1-19.
- Defends Aquinas’ seemingly contradictory “double affirmations” of divine causation and human freedom.
Hunt, David (2009). “The Providential Advantage of Divine Foreknowledge.” Arguing about Religion. Ed. Timpe, Kevin. New York: Routledge, pp. 374-385.
- Argues that simple foreknowledge enables God to secure results that He would not be able to secure without it.
McCann, Hugh (2005). “The Author of Sin?” Faith and Philosophy, Vol. 22, No. 2, pp. 144-159.
- Argues that theological determinism does not endanger human freedom, as natural determinism does, and that God cannot do moral wrong, since morality is grounded in divine commands.
Pereboom, Derk (2011). “Theological Determinism and Divine Providence.” Molinism: The Contemporary Debate. Ed. Ken Perszyk. Oxford: Oxford University Press, pp. 262-280.
- Defends compatibility of hard theological determinism and traditional theism.
Pereboom, Derk (2015). “Libertarianism and Theological Determinism.” Free Will and Theism: Connections, Contingencies, and Concerns. Ed. Timpe, Kevin and Dan Speak. Under contract with Oxford University Press.
- Offers response to the problem of evil compatible with hard theological determinism.
Plantinga, Alvin (1974). God, Freedom, and Evil. Grand Rapids, MI: Eerdmans.
- Develops a free will defense.
Plantinga, Alvin (1985). “Reply to Robert M. Adams.” Alvin Plantinga (Profiles. Vol. 5). Ed. Tomberlin, James and Peter van Inwagen. Dordrecht: D. Reidel, pp. 371-382.
- Contains proposal of possible grounds for counterfactuals of freedom.
Plantinga, Alvin (1986). “On Ockham’s Way Out.” Faith and Philosophy, Vol. 3 No. 3, pp. 235–269.
- Defends claim that humans have counterfactual power over God’s past knowledge.
Rogers, Katherin (2000). Perfect Being Theology. Edinburgh: Edinburgh University Press.
- Considers implications of the description of God as “that than which none greater can be conceived.”
Stump, Eleonore (1985). “The Problem of Evil.” Faith and Philosophy, Vol. 2, No. 4, pp. 392-423.
- Contains proposal that sin and suffering facilitate human acceptance of saving grace and process of sanctification.
Stump, Eleonore (2003). Aquinas. New York: Routledge.
- Contains response to argument for theological determinism on the basis of divine aseity.
Tanner, Kathryn (1994). “Human Freedom, Human Sin, and God the Creator.” The God Who Acts: Philosophical and Theological Explorations. Ed. Thomas Tracy. University Park: Pennsylvania State University Press, pp. 111-135.
- Argues for the compatibility of universal divine causation and libertarian human freedom.
Tracy, Thomas (1994). “Divine Action, Created Causes, and Human Freedom.” The God Who Acts: Philosophical and Theological Explorations. Ed. Thomas Tracy. University Park: Pennsylvania State University Press, pp. 77-102.
- Contains critique of attempt to hold together theological determinism and libertarian human freedom.
Turner, Jason (2013). “Compatibilism and the Free Will Defense.” Faith and Philosophy, Vol. 30, No. 2, pp. 125-137.
- Offers version of free will defense compatible with theological determinism.
Vicens, Leigh (2012). “Divine Determinism, Human Freedom, and the Consequence Argument.” International Journal for Philosophy of Religion, 71:2, pp. 145-155.
- Argues that if natural determinism is incompatible with human freedom, so is theological determinism.
Zagzebski, Linda (2011). “Eternity and Fatalism.” God, Eternity, and Time. Ed. Christian Tapp. Aldershot: Ashgate Press.
- Argues that appeals to divine timelessness do not solve the problem of how divine foreknowledge is compatible with our ability to do otherwise. A parallel point can be made about the problem of how divine foreknowledge is compatible with indeterminism.

Author Information

Leigh Vicens
Email: lvicens@augie.edu
Augustana College
U. S. A.

Locke: Ethics

The major writings of John Locke (1632–1704) are among the most important texts for understanding some of the central currents in epistemology, metaphysics, politics, religion, and pedagogy in the late 17th and early 18th centuries in Western Europe. His magnum opus, An Essay Concerning Human Understanding (1689), is the undeniable starting point for the study of empiricism in the early modern period. Locke’s best-known political text, Two Treatises of Government (1689), criticizes the political system according to which kings rule by divine right (First Treatise) and lays the foundation for modern liberalism (Second Treatise). His Letter Concerning Toleration (1689) argues that much civil unrest is born of the state trying to prevent the practice of different religions. In this text, Locke suggests that the proper domain of government does not include deciding which religious path the people ought to take for salvation; in short, it is an argument for the separation of church and state. Some Thoughts Concerning Education (1693) is a very influential text in early modern Europe that outlines the best way to rear children. It suggests that the virtue of a person is directly related to the habits of body and the habits of mind instilled in them by their educators.

Although these texts enjoy a status of “must-reads,” Locke’s views on ethics or moral philosophy have nowhere near the same high status. The reason for this is, in large part, that Locke never wrote a text devoted to the topic. This omission is surprising given that several of his friends entreated him to set down his thoughts about ethics. They saw that the scattered remarks that Locke makes about morality here and there throughout his works were, at times, quite provocative and in need of further development and defense. But, for reasons unknown to us, Locke never indulged his friends with a more systematic moral philosophy. It is thus up to his readers to stitch together his fragmented remarks about happiness, moral laws, freedom, and virtue in order to see what kind of moral philosophy is woven through the texts and to determine whether it is a coherent position.

Introduction
The Good
1. Pleasure and Pain
2. Happiness
The Law of Nature
Power, Freedom, and Suspending Desire
Living the Moral Life
References and Further Reading

1. Introduction

While Locke did not write a treatise devoted to a discussion of ethics, there are strands of discussion of morality that weave through many, if not most, of his works. One such strand is evident near the end of his An Essay Concerning Human Understanding (hereafter: Essay) where he states that one of the most important aspects of improving our knowledge is to recognize the kinds of things that we can truly know. With this recognition, he says, we are able to fine-tune the focus of our enquiries for optimal results. And, he concludes, given the natural capacities of human beings, “Morality is the proper Science, and Business of Mankind in general” because human beings are both “concerned” and “fitted to search out their Summum Bonum [highest good]” (Essay, Book IV, chapter xii, section 11; hereafter: Essay, IV.xii.11). This claim indicates that Locke takes the investigation of morality to be of utmost importance and gives us good reason to think that Locke’s analysis of the workings of human understanding in general is intimately connected to discovering how the science proper to humankind is to be practiced. The content of the knowledge of ethics includes information about what we, as rational and voluntary agents, ought to do in order to obtain an end, in particular, the end of happiness. It is the science, Locke says, of using the powers that we have as human beings in order to act in such a way that we obtain things that are good and useful for us. As he says: ethics is “the seeking out those Rules, and Measures of humane Actions, which lead to Happiness, and the Means to practice them” (Essay, IV.xxi.3). So, there are several elements in the landscape of Locke’s ethics: happiness or the highest good as the end of human action; the rules that govern human action; the powers that command human action; and the ways and means by which the rules are practiced.
While Locke lays out this conception of ethics in the Essay, not all aspects of his definition are explored in detail in that text. So, in order to get the full picture of how he understands each element of his description of ethics, we must often look to several different texts where they receive a fuller treatment. This means that Locke himself does not explain how these elements fit together, leaving his overarching theory something of a puzzle for future commentators to contemplate. But, by mining different texts in this way, we can piece together the details of an ethical theory that, while not always obviously coherent, presents a depth and complexity that, at minimum, confirms that this is a puzzle worth trying to solve.

2. The Good

a. Pleasure and Pain

The thread of moral discussion that weaves most consistently throughout the Essay is the subject of happiness. True happiness, on Locke’s account, is associated with the good, which in turn is associated with pleasure. Pleasure, in its turn, is taken by Locke to be the sole motive for human action. This means that the moral theory that is most directly endorsed in the Essay is hedonism.

On Locke’s view, ideas come to us by two means: sensation and reflection. This view is the cornerstone of his empiricism. According to this theory, there is no such thing as innate ideas or ideas that are inborn in the human mind. All ideas come to us by experience. Locke describes sensation as the “great source” of all our ideas and as wholly dependent on the contact between our sensory organs and the external world. The other source of ideas, reflection or “internal sense,” is dependent on the mind’s reflecting on its own operations, in particular the “satisfaction or uneasiness arising from any thought” (Essay, II.i.4). What’s more, Locke states that pleasure and pain are joined to almost all of our ideas both of sensation and of reflection (Essay, II.vii.2). This means that our mental content is organized, at least in one way, by ideas that are associated with pleasure and ideas that are associated with pain. That our ideas are associated with pains and pleasures seems compatible with our phenomenal experience: the contact between the sense organ of touch and a hot stove will result in an idea of the hot stove annexed to the idea of pain, or the act of remembering a romantic first kiss brings with it the idea of pleasure. And, Locke adds, it makes sense to join our ideas to the ideas of pleasure and pain because if our ideas were not joined with either pleasure or pain, we would have no reason to prefer the doing of one action over another, or the consideration of one idea over another. If this were our situation, we would have no reason to act, either physically or mentally (Essay, II.vii.3). That pleasure and pain are given this motivational role in action entails that Locke endorses hedonism: the pursuit of pleasure and the avoidance of pain are the sole motives for action.

Locke notes that among all the ideas that we receive by sensation and reflection, pleasure and pain are very important. And, he notes that the things that we describe as evil are no more than the things that are annexed to the idea of pain, and the things that we describe as good are no more than the things that are annexed to the idea of pleasure. In other words, the presence of good or evil is nothing other than the way a particular idea relates to us—either pleasurably or painfully. This means that on Locke’s view, good is just the category of things that tend to cause or increase pleasure or decrease pain in us, and evil is just the category of things that tend to cause or increase pain or decrease pleasure in us (Essay, II.xx.2). Now, we might think that, morally speaking, this way of defining good and evil gets Locke into trouble. Consider the following scenario. Smith enjoys breaking her promises. In other words, failing to honor her word brings her pleasure. According to the view just described, it seems that breaking promises, at least for Smith, is a good. For, if good and evil are defined as nothing more than pleasure and pain, it seems that if something gives Smith pleasure, it is impossible to deny that it is a good. This would be an unwelcome effect of Locke’s view, for it would indicate that his system leads directly to a kind of moral relativism. If promise breaking is pleasurable for Smith and promise keeping is pleasurable for her friend Jones and pleasure is the sign of the good, then it seems that the good is relative and there is no sense in which we can say that Jones is right about what is good and Smith is wrong. Locke blocks this kind of consequence for his view by introducing a distinction between “happiness” and “true happiness.” This indicates that while all things that bring us pleasure are linked to happiness, there is also a category of pleasure-bringing things that are linked to true happiness. 
It is the pursuit of the members of this special category of pleasurable things that is, for Locke, emblematic of the correct use of our intellectual powers.

b. Happiness

Locke is very clear: we all constantly desire happiness. All of our actions, on his view, are oriented towards securing happiness. Uneasiness, Locke’s technical term for being in a state of pain and desirous of some absent good, is the motive that moves us to act in the way that is expected to relieve the pain of desire and secure the state of happiness (Essay, II.xxi.36). But, while Locke equates pleasure with good, he is careful to distinguish between the happiness that is acquired as a result of the satisfaction of any particular desire and the true happiness that is the result of the satisfaction of a particular kind of desire. Drawing this distinction allows Locke to hold that the pursuit of certain sets of pleasures or goods is more worthy than the pursuit of others.

The pursuit of true happiness, according to Locke, is equated with “the highest perfection of intellectual nature” (Essay, II.xxi.51). And, indeed, Locke takes our pursuit of this true happiness to be the thing to which the vast majority of our efforts should be oriented. To do this, he says that we need to try to match our desires to “the true intrinsick good” that is really within things. Notice here that Locke is implying that there is a distinction to be drawn between the “true intrinsic good” of a thing and, it seems, the good that we unreflectively take to be within a certain thing. The idea here is that attentively considering a particular thing will allow us to see its true value as opposed to the superficial value we assign to a thing based on our immediate reaction to it. We can think, for example, of a bitter tasting medicine. A face-value assessment of the medicine will lead us to judge that the thing is to be avoided. However, more information and contemplation of it will lead us to see that the true worth of the medicine is, in fact, high and so it should be evaluated as a good to be pursued. And, Locke states, if we contemplate a thing long enough, and see clearly the measure of its true worth, we can change our desire and uneasiness for it in proportion to that worth (Essay, II.xxi.53). But how are we to understand Locke’s suggestion that there is a true, intrinsic good in things? So far, all he has said about the good is that it is tracked by pleasure. We begin to get an answer to this question when Locke acknowledges the obvious fact that different people derive pleasure and pain from different things. While he reiterates that happiness is no more than the possession of those things that give the most pleasure and the absence of those things that cause the most pain, and that the objects in these two categories can vary widely among people, he adds the following provocative statement:

If therefore Men in this Life only have hope; if in this Life they can only enjoy, ’tis not strange, nor unreasonable, that they should seek their Happiness by avoiding all things, that disease them here, and by pursuing all that delight them; wherein it will be no wonder to find variety and difference. For if there be no Prospect beyond the Grave, the inference is certainly right, Let us eat and drink, let us enjoy what we delight in, for tomorrow we shall die [Isa. 22:13; I Cor. 15:32]. (Essay, II.xxi.55)

Here, Locke suggests that pursuing and avoiding the particular things that give us pleasure or pain would be a perfectly acceptable way to live were there “no prospect beyond the grave.” It seems that what Locke means is that if there were no judgment day, which is to say that if our actions were not ultimately judged by God, there would be no reason to do otherwise than to blindly follow our pleasures and flee our pains. Now, given this suggestion, the question, then, is how to distinguish between the things that are pleasurable but that will not help our case on judgment day, and those that will. Locke provides a clue for how to do such a thing when he says that the will is typically determined by those things that are judged to be good by the understanding. However, in many cases we use “wrong measures of good and evil” and end by judging unworthy things to be good. He who makes such a mistake errs because “[t]he eternal Law and Nature of things must not be alter’d to comply with his ill order’d choice” (Essay, II.xxi.56). In other words, there is an ordered way to choose which things to pursue—the things that are in accordance with the eternal law and nature of things—and an ill-ordered way, in accordance with our own palates. This indicates that Locke takes there to be a fixed law that determines which things are worthy of our pursuit, and which are not. This means that Locke takes there to be an important distinction between the good, understood as all objects that are connected to pleasure and the moral good, understood as objects connected to pleasure which are also in conformity with a law. 
Though the distinctions between good and moral good, and between evil and moral evil, are not discussed in any great detail by Locke, he does state that moral good and evil are nothing other than the “Conformity or Disagreement of our voluntary Actions to some Law.” Locke states that punishments and rewards are bestowed on us for our following or failure to follow this law by “the Will and Power of the Law-maker” (Essay, II.xxviii.5). So, Locke affirms that moral good and evil are closely tied to the observance or violation of some law, and that the lawmaker has the power to reward or punish those who adhere to or stray from the law.

3. The Law of Nature

a. Existence

In the Essay, the concepts of laws and lawmakers do not receive much treatment beyond Locke’s affirmation that God has decreed laws and that there are rewards and punishments associated with the respect or violation of these laws (Essay, I.iii.6; I.iii.12; II.xxi.70; II.xxviii.6). The two most important questions concerning the role of laws in a system of ethics remain unanswered in the Essay: (1) how do we determine the content of the law? This is the epistemological question. And (2) what kind of authority does the law have to obligate? This is the moral question. Locke spends much time considering these questions in a series of nine essays written some thirty years before the Essay, which are known under the collected title Essays on the Law of Nature (hereafter: Law).

The first essay in the series treats the question of whether there is a “rule of morals, or law of nature given to us.” The answer is unequivocally “yes” (Law, Essay I, page 109; hereafter: Law, I: 109). The reason for this positive answer, in short, is that God exists. Locke appeals to a kind of teleological argument to support the claim of God’s existence, saying that given the organization of the universe, including the organized way in which animal and vegetable bodies propagate, there must be a governing principle that is responsible for the patterns we see on earth. And, if we extend this principle to the existence of human life, Locke claims that it is reasonable to believe that there is a pattern or a law that governs behavior. This law is to be understood as moral good or virtue and, Locke states, it is the decree of God’s will and is discernible by “the light of nature.” Because the law tells us what is and is not in conformity with “rational nature,” it has the status of commanding or prohibiting certain behaviors (Law, I: 111; see also Essay, IV.xix.16). Because all human beings possess, by nature, the faculty of reason, all human beings, at least in principle, can discover the natural law.

Locke offers five reasons for thinking that such a natural law exists. He begins by noting that it is evident that there is some disagreement among people about the content of the law. However, far from thinking that such disagreement casts doubt on the existence of the law, he takes the presence of disagreement about the law as evidence that such a true and objective law exists. Disagreements about the content of the law confirm that everyone is in agreement about the fundamental character of the law (that there are things that are by their nature good or evil) but just disagree about how to interpret the law (Law, I: 115). Second, the existence of the law is further reinforced by the fact that we often pass judgment on our own actions, by way of our conscience, leading to feelings of guilt or pride. Because it is not possible, according to Locke, to pronounce a judgment without the existence of a law, the act of conscience demonstrates that such a natural law exists. Third, again appealing to a kind of teleological argument, Locke states that we see that laws govern all manner of natural operations and that it makes sense that human beings would also be governed by laws that are in accordance with their nature (Law, I: 117). Fourth, Locke states that without the natural law, society would not be able to run the way that it does. He suggests that the force of civil law is grounded on the natural law. In other words, without the natural law, positive law would have no moral authority. Elsewhere, Locke underlines this point by saying that given that the law of nature is the eternal rule for all men, the rules made by legislators must conform to this law (The Two Treatises of Government, Treatise II, section 135, hereafter: Government, II.135). Finally, on Locke’s view, there would be no virtue or vice, no reward or punishment, no guilt, if there were no natural law (Law, I: 119). Without the natural law, there would be no bounds on human action.
This means that we would be motivated only to do what seems pleasurable and there would be no sense in which anyone could be considered virtuous or vicious. The existence of the natural law, then, allows us to be sensitive to the fact that there are certain pleasures that are more in line with what is objectively right. Indeed, Locke also gestures towards, but does not elaborate on, this kind of thought in the Essay. He suggests that the studious man, who takes all his pleasures from reading and learning, will eventually be unable to ignore his desires for food and drink. Likewise, the “Epicure,” whose only interest is in the sensory pleasures of food and drink, will eventually turn his attention to study when shame or the desire to “recommend himself to his Mistress” will raise his uneasiness for knowledge (Essay, II.xxi.43).

So, Locke has given us five reasons to accept the existence of the law of nature that grounds virtuous and vicious behavior. We turn now to how he thinks we come to know the content of the law.

b. Content

Locke suggests that there are two ways to determine the content of the law of nature: by the light of nature and by sense experience.

Locke is careful to note that by “light of nature” he does not mean something like an “inward light” that is “implanted in man” and like a compass constantly leads human beings towards virtue. Rather, this light is to be understood as a kind of metaphor that indicates that truth can be attained by each of us individually by nothing more than the exercise of reason and the intellectual faculties (Law, II: 123). Locke uses a comparison to precious metal mining to make this point clear. He acknowledges that some might say that his explanation of the discovery of the content of the law by the light of nature entails that everyone should always be in possession of the knowledge of this content. But, he notes, this is to take the light of nature as something that is stamped on the hearts of human beings, which is a mistake (see Law, III, 137-145). While the depths of the earth might contain veins of gold and silver, Locke says, this does not mean that everyone living on the stretch of land above those veins is rich (Law, II: 135). Work must be done to dig out the precious metals in order to benefit from their value. Similarly, proper use must be made of the faculties we have in order to benefit from the certainty provided by the light of nature. Locke notes that we can come to know the law of nature, in a way, by tradition, which is to say by the testimony and instruction of other people. But it is a mistake to follow the law for any reason other than that we recognize its universal binding force. This can only be done by our own intellectual investigation (Law, II: 129).

But what, exactly, is the light of nature? Locke acknowledges that it is difficult to answer this question—it is not something stamped on the heart or mind, nor is it something that is exclusively learned by tradition or testimony. The only option left for describing it, then, is that it is something acquired or experienced by sense experience or by reason. And, indeed, Locke suggests that when these two faculties, reason and sensation, work together, nothing can remain obscure to the mind. Sensation provides the mind with ideas and reason guides the faculty of sensation and arranges “together the images of things derived from sense-perception, thence forming others [ideas] and composing new ones” (Law, IV: 147). Locke emphasizes that reason ought to be taken to mean “the discursive faculty of the mind, which advances from things known to things unknown,” using as its foundation the data provided by sense experience (Law, IV: 149).

When directly addressing the question of how the combination of reason and sense experience allow us to know the content of the law of nature, Locke states that two important truths must be acknowledged because they are “presupposed in the knowledge of any and every law” (Law, IV: 151). First, we must understand that there is a lawmaker who decreed the law, and that the lawmaker is rightly obeyed as a superior power (a discussion of this point is also found in Government, I.81). Second, we must understand that the lawmaker wishes those to whom the law is decreed to follow the law. Let us take each of these in turn.

Sense experience allows us to know that a lawmaker exists. To demonstrate this, Locke appeals, once again, to a kind of teleological argument: by our senses we come to know the objects of the external world and, importantly, the regularities with which they move and change. We also see that we human beings are part of the movements and changes of the external world. Reason, then, contemplates these regularities and orders of change and motion and naturally comes to inquire about their origin. The conclusion of such an inquiry, states Locke, is that a powerful and wise creator exists. This conclusion follows from two observations: (1) that beasts and inanimate things cannot be the cause of the existence of human beings because they are clearly less perfect than human beings, and something less perfect cannot bring more perfect things into existence, and (2) that we ourselves cannot be the cause of our own existence because if we possessed the power to create ourselves, we would also have the power to give ourselves eternal life. Because it is obviously the case that we do not have eternal life, Locke concludes that we cannot be the origin of our own existence. So, Locke says, there must be a powerful agent, God, who is the origin of our existence (Law, IV: 153). The senses provide the data from the external world, and reason contemplates the data and concludes that a creator of the observed objects and phenomena must exist. Once the existence of a creator is determined, Locke thinks that we can also see that the creator has “a just and inevitable command over us and at His pleasure can raise us up or throw us down, and make us by the same commanding power happy or miserable” (Law, IV: 155). This commanding power, on Locke’s view, indicates that we are necessarily subject to the decrees of God’s will. (A similar line of discussion is found in Locke’s The Reasonableness of Christianity, 144–46.)

As for the second truth, that the lawmaker, God, wishes us to follow the laws decreed, Locke states that once we see that there is a creator of all things and that an order obtains among them, we see that the creator is both powerful and wise. It follows from these evident attributes that God would not create something without a purpose. Moreover, we notice that our minds and bodies seem well equipped for action, which suggests, “God intends man to do something.” And, the “something” that we are made to do, according to Locke, is the same purpose shared by all created things—the glorification of God (Law, IV: 157). In the case of rational beings, Locke states that given our nature, our function is to use sense experience and reason in order to discover, contemplate, and praise God’s creation; to create a society with other people and to work to maintain and preserve both oneself and the community. And this, in fact, is the content of the law of nature—to preserve one’s own being and to work to maintain and preserve the beings of the other people in our community. This injunction to preserve oneself and to preserve one’s neighbors is also endorsed and stressed throughout Locke’s discussions of political power and freedom (see Government, I.86, 88, 120; II.6, 25, 128).

c. Authority

Once we have knowledge of the content of the law of nature, we must determine from where it derives its authority. In other words, we must ask why we are bound to follow the law once we are aware of its content. Locke begins this discussion by reiterating that the law of nature “is the care and preservation of oneself.” Given this law, he states that virtue should not be understood as a duty but rather the “convenience” of human beings. In this sense, the good is nothing more than what is useful. Further, he adds, the observance of this law is not so much an obligation but rather “a privilege and an advantage, to which we are led by expediency” (Law, VI: 181). This indicates that Locke thinks that actions that are in conformity with the law are useful and practical. In other words, it is in our best interest to follow the law. While this characterization of why we in fact follow the law is compelling, there is nevertheless still an inquiry to be made into why we ought to follow the law.

Locke begins his treatment of this question by stating that no one can oblige us to do anything unless the one who obliges has some superior right and power over us. The obligation that is generated between such a superior power and those who are subject to it results in two kinds of duties: (1) the duty to pay obedience to the command of the superior power. Because our faculties are suited to discover the existence of the divine lawmaker, Locke takes it to be impossible to avoid this discovery, barring some damage or impediment to our faculties. This duty is ultimately grounded in God’s will as the force by which we were created (Law, VI: 183). (2) The duty to suffer punishment as a result of the failure to honor the first duty—obedience. Now, it might seem odd that it would be necessary to postulate that punishment results from the failure to respect a law the content of which is only that we must take care of ourselves. In other words, how could anyone express so little interest in taking care of himself or herself that the fear of punishment is needed to motivate the actions necessary for such care? It is worth quoting Locke’s answer in full:

[A] liability to punishment, which arises from a failure to pay dutiful obedience, so that those who refuse to be led by reason and to own that in the matter of morals and right conduct they are subject to a superior authority may recognize that they are constrained by force and punishment to be submissive to that authority and feel the strength of Him whose will they refuse to follow. And so the force of this obligation seems to be grounded in the authority of a lawmaker, so that power compels those who cannot be moved by warnings. (Law, VI: 183)

So, even though the existence, content, and authority of the law of nature are known in virtue of the faculties possessed by all rational creatures—sense experience and reason—Locke recognizes that there are people who “refuse to be led by reason.” Because these people do not see the binding force of the law by their faculties alone, they need some other impetus to motivate their behavior. But Locke thinks very ill of those who are in need of this other impetus. He says that these features of the law of nature can be discovered by anyone who is diligent about directing their mind to them, and can be concealed from no one “unless he loves blindness and darkness and casts off nature in order that he may avoid his duty” (Law, VI: 189; see also Government, II.6).

d. Reconciling the Law with Happiness

The main lines of Locke’s natural law theory are as follows: there is a moral law that is (1) discoverable by the combined work of reason and sense experience, and (2) binding on human beings in virtue of being decreed by God. Now, in §1 above, we saw that Locke thinks that all human beings are naturally oriented to the pursuit of happiness. This is because we are motivated to pursue things if they promise pleasure and to avoid things if they promise pain. It has seemed to many commentators that these two discussions of moral principles are in tension with each other. On the view described in Law, Locke straightforwardly appeals to reason and our ability to understand the nature of God’s attributes to ground our obligation to follow the law of nature. In other words, what is lawful ought to be followed because God wills it and what is unlawful ought to be rejected because it is not willed by God. Because we can straightforwardly see that God is the law-giver and that we are by nature subordinate to Him, we ought to follow the law. By contrast, in the discussion of happiness and pleasure in the Essay, Locke explains that good and evil reduce to what is pleasurable and what is painful. While he does also indicate that the special categories of good and evil—moral good and moral evil—are no more than the conformity or disagreement between our actions and a law, he immediately adds that such conformity or disagreement is followed by rewards or punishments that flow from the lawmaker’s will. From this discussion, then, it is difficult to see whether Locke holds that it is the reward and punishment that binds human beings to act in accordance with the law, or if it is the fact that the law is willed by God.

One way to approach this problem is to suggest that Locke changed his mind. Because of the thirty-year gap between Law and the Essay, we might be tempted to think that the more rationalist picture, where the law and its authority are based on reason, was the young Locke’s view when he wrote Law. This view, the story would go, was replaced by Locke’s more considered and mature view, hedonism. But this approach must be resisted because both theories are present in early and late works. The role of pleasure and pain with respect to morality is present not only in the Essay, but is also invoked in Law (passage quoted at the end of §2c) and in various minor essays written in the years between Law and the Essay (for example, ‘Morality’ (c.1677–78) in Political Essays, 267–69). Likewise, the role of the authority of God’s will is retained after Law, again evident in various minor essays (for example, ‘Virtue B’ (1681) in Political Essays, 287–88), in Government (II.6), in Locke’s correspondence (for example, to James Tyrrell, 4 August 1690, Correspondence, Vol.4, letter n.1309), and even in the Essay itself (II.xxviii.8). An answer to how we might reconcile these two positions is suggested when we consider the texts where appeals to both theories are found side by side.

In his essay Of Ethick in General (c. 1686–88) Locke affirms the hedonist view that happiness and misery consist only in pleasure and pain, and that we all naturally seek happiness. But in the very next paragraph, he states that there is an important difference between moral and natural good and evil—the pleasure and pain that are consequences of virtuous and vicious behavior are grounded in the divine will. Locke notes that drinking to excess leads to pain in the form of headache or nausea. This is an example of a natural evil. By contrast, transgressing a law would not have any painful consequences if the law were not decreed by a superior lawmaker. He adds that it is impossible to motivate the actions of rational agents without the promise of pain or pleasure (Of Ethick in General, §8). From these considerations, Locke suggests that the proper foundation of morality, a foundation that will entail an obligation to moral principles, needs two things. First, we need the proof of a law, which presupposes the existence of a lawmaker who is superior to those to whom the law is decreed. The lawmaker has the right to ordain the law and the power to reward and punish. Second, it must be shown that the content of the law is discoverable to humankind (Of Ethick in General, §12). In this text it seems that Locke suggests that both the force and authority of the divine decree and the promise of reward and punishment are necessary for the proper foundation of an obligating moral law.

A similar line of argument is found in the Essay. There, Locke asserts that in order to judge moral success or failure, we need a rule by which to measure and judge action. Further, each rule of this sort has an “enforcement of Good and Evil.” This is because, according to Locke, “where-ever we suppose a Law, suppose also some Reward or Punishment annexed to that Law” (Essay, II.xxviii.6). Locke states that some promise of pleasure or pain is necessary in order to determine the will to pursue or avoid certain actions. Indeed, he puts the point even more strongly, saying that it would be in vain for the intelligent being who decrees the rule of law to so decree without entailing reward or punishment for the obedient or the unfaithful (see also Government, II.7). It seems, then, that reason discovers the fact that a divine law exists and that it derives from the divine will and, as such, is binding. We might think, as Stephen Darwall suggests in The British Moralists and the Internal Ought, that if reason is that which discovers our obligation to the law, the role for reward and punishment is to motivate our obedience to the law. While this succeeds in making room for both the rationalist and hedonist strains in Locke’s view, some other texts seem to indicate that by reason alone we ought to be motivated to follow moral laws.

One striking instance of this kind of suggestion is found in the third book of the Essay where Locke boldly states that “Morality is capable of Demonstration” in the same way as mathematics (Essay, III.xi.16). He explains that once we understand the existence and nature of God as a supreme being who is infinite in power, goodness, and wisdom and on whom we depend, and our own nature “as understanding, rational Beings,” we should be able to see that these two things together provide the foundation of both our duty and the appropriate rules of action. On Locke’s view, with focused attention the measures of right and wrong will become as clear to us as the propositions of mathematics (Essay, IV.iii.18). He gives two examples of such certain moral principles to make the point: (1) “Where there is no Property, there is no Injustice” and (2) “No Government allows absolute Liberty.” He explains that property implies a right to something and injustice is the violation of a right to something. So, if we clearly see the intensional definition of each term, we see that (1) is necessarily true. Similarly, government indicates the establishment of a society based on certain rules, and absolute liberty is the freedom from any and all rules. Again, if we understand the definitions of the two terms in the proposition, it becomes obvious that (2) is necessarily true. And, Locke states, following this logic, 1 and 2 are as certain as the proposition that “a Triangle has three Angles equal to two right ones” (Essay, IV.iii.18). If moral principles have the same status as mathematical principles, it is difficult to see why we would need further inducement to use these principles to guide our behavior. While there is no clear answer to this question, Locke does provide a way to understand the role of reward and punishment in our obligation to moral principles despite the fact that it seems that they ought to obligate by reason alone.

Early in the Essay, over the course of giving arguments against the existence of innate ideas, Locke addresses the possibility of innate moral principles. He begins by saying that for any proposed moral rule human beings can, with good reason, demand justification. This precludes the possibility of innate moral principles because, if they were innate, they would be self-evident and thus would not be candidates for justification. Next, Locke notes that despite the fact that there are no innate moral principles, there are certain principles that are undeniable, for example, that “men should keep their Compacts.” However, when asked why people follow this rule, different answers are given. A “Hobbist” will say that it is because the public requires it, and the “Leviathan” will punish those who disobey the law. A “Heathen” philosopher will say that it is because following such a law is a virtue, which is the highest perfection for human beings. But a Christian philosopher, the category to which Locke belongs, will say that it is because “God, who has the Power of eternal Life and Death, requires it of us” (Essay, I.iii.5). Locke builds on this statement in the following section when he notes that while the existence of God and the truth of our obedience to Him is made manifest by the light of reason, it is possible that there are people who accept the truth of moral principles, and follow them, without knowing or accepting the “true ground of Morality; which can only be the Will and Law of God” (Essay, I.iii.6). Here Locke is suggesting that we can accept a true moral law as binding and follow it as such, but for the wrong reasons. This means that while the Hobbist, the Heathen, and the Christian might all take the same law of keeping one’s compacts to be obligating, only the Christian does it for the right reason—that God’s will requires our obedience to that law. 
Indeed, Locke states that if we receive truths by revelation they too must be subject to reason, for to follow truths based on revelation alone is insufficient (see Essay, IV.xviii).

Now, to determine the role of pain and pleasure in this story, we turn to Locke’s discussion of the role of pain and pleasure in general. He says that God has joined pains and pleasures to our interaction with many things in our environment in order to alert us to things that are harmful or helpful to the preservation of our bodies (Essay, II.vii.4). But, beyond this, Locke notes that there is another reason that God has joined pleasure and pain to almost all our thoughts and sensations: so that we experience imperfections and dissatisfactions. He states that the kinds of pleasures that we experience in connection to finite things are ephemeral and not representative of complete happiness. This dissatisfaction coupled with the natural drive to obtain happiness opens the possibility of our being led to seek our pleasure in God, where we anticipate a more stable and, perhaps, permanent happiness. Appreciating this reason why pleasure and pain are annexed to most of our ideas will, according to Locke, lead the way to the ultimate aim of the enquiry in human understanding—the knowledge and veneration of God (Essay, II.vii.5–6). So, Locke seems to be suggesting here that pain and pleasure prompt us to find out about God, in whom complete and eternal happiness is possible. This search, in turn, leads us to knowledge of God, which will include the knowledge that He ought to be obeyed in virtue of His decrees alone. Pleasure and pain, reward and punishment, on this interpretation, are the means by which we are led to know God’s nature, which, once known, motivates obedience to His laws. This mechanism supports Locke’s claim that real happiness is to be found in the perfection of our intellectual nature—in embarking on the search for knowledge of God, we embark on the intellectual journey that will lead to the kind of knowledge that brings permanent pleasure. 
This at least suggests that the knowledge of God has the happy double-effect of leading to both more stable happiness and the understanding that God is to be obeyed in virtue of His divine will alone.

But given that all human beings experience pain and pleasure, Locke needs to explain how it is that certain people are virtuous, having followed the experience of dissatisfaction to arrive at the knowledge of God, and other people are vicious, who seek pleasure and avoid pain for no reason other than their own hedonic sensations.

4. Power, Freedom, and Suspending Desire

a. Passive and Active Powers

In any discussion of ethics, it is important not only to determine what, exactly, counts as virtuous and vicious behavior, but also the extent to which we are in control of our actions. This is important because, in order to attribute praise or blame, reward or punishment to an agent, we need to be able to see the way in which she is the causal source of her own actions. Locke addresses this issue in one of the longest chapters of the Essay—“Of Power.” In this chapter, Locke describes how he understands the nature of power, the human will, freedom and its connection to happiness, and, finally, the reasons why many (or even most) people do not exercise their freedom in the right kind of way and are unhappy as a result. It is worth noting here that this chapter of the Essay underwent major revisions throughout the five editions of the Essay and in particular between the first and second editions. The present discussion is based on the fourth edition of the Essay (but see the “References and Further Reading” below for articles that discuss the relevance of the changes throughout all five editions).

Locke states that we come to have the idea of “power” by observing the fact that things change over time. Finite objects are changed as a result of interactions with other finite objects (for example fire melts gold) and we notice that our own ideas change either as a result of external stimulus (for example the noise of a jackhammer interrupts the contemplation of a logic problem) or as a result of our own desires (for example hunger interrupts the contemplation of a logic problem). The idea of power always includes some kind of relation to action or change. The passive side of power entails the ability to be changed and the active side of power entails the ability to make change. Our observation of almost all sensible things furnishes us with the idea of passive power. This is because sensible things appear to be in almost constant flux—they are changed by their interaction with other sensible things, with heat, cold, rain, and time. And, Locke adds, such observations give us no fewer instances of the idea of active power, for “whatever Change is observed, the Mind must collect a Power somewhere, able to make that Change” (Essay, II.xxi.4). However, when it comes to active powers, Locke states that the clearest and most distinct idea of active power comes to us from the observation of the operations of our own minds. He elaborates by stating that there are two kinds of activities with which we are familiar: thinking and motion. When we consider body in general, Locke states that it is obvious that we receive no idea of thinking, which only comes from a contemplation of the operations of our own minds. But neither does body provide the idea of the beginning of motion, only of the continuation or transfer of motion. 
The idea of the beginning of motion, which is the idea associated with the active power of motion, only comes to us when we reflect “on what passes in our selves, where we find by Experience, that barely by willing it, barely by a thought of the Mind, we can move the parts of our Bodies, which were before at rest” (Essay, II.xxi.4). So, it seems, the operation of our minds, in particular the connection between one kind of thought, willing, and a change in either the content of our minds or the orientation of our bodies, provides us with the idea of an active power.

b. The Will

The power to stop, start, or continue an action of the mind or of the body is what Locke calls the will. When the power of the will is exercised, a volition (or willing) occurs. Any action (or forbearance of action) that follows volition is considered voluntary. The power of the will is coupled with the power of the understanding. This latter power is defined as the power of perceiving ideas and their agreement or disagreement with one another. The understanding, then, provides ideas to the mind and the will, depending on the content of these ideas, prefers certain courses of action to others. Locke explains that the will directs action according to its preference—and here we must understand “preference” in the most general sense of inclination, partiality, or taste. In short, the will is attracted to actions that promise the procurement of pleasing things and/or the distancing from displeasing things. The technical term that Locke uses to describe that which determines the will is uneasiness. He elaborates, stating that the reason why any action is continued is “the present satisfaction in it” and the reason why any action is taken to move to a new state is dissatisfaction (Essay, II.xxi.29). Indeed, Locke affirms that uneasiness, at bottom, is really no more than desire, where the mind is disturbed by a “want of some absent good” (Essay, II.xxi.31). So, any pain or discomfort of the mind or body is a motive for the will to command a change of state so as to move from unease to ease. Locke notes that it is a common fact of life that we often experience multiple uneasinesses at one time, all pressing on us and demanding relief. But, he says, when we ask the question of what determines the will at any one moment, the answer is the most pressing uneasiness (Essay, II.xxi.31). Imagine a situation where you are simultaneously experiencing discomfort as a result of hunger and the anxiety of being under-prepared for tomorrow’s philosophy exam. 
On Locke’s view the most intense or the most pressing of these uneasinesses will determine your will to command the action that will relieve it. This means that no matter how much you want to stay at the library to study, if hunger comes to be more pressing than the desire to pass the exam, hunger will determine the will to act, commanding the action that will result in the procurement of food.

While Locke states that the most pressing uneasiness determines the will, he adds that it does so “for the most part, but not always.” This is because he takes the mind to have the power to “suspend the execution and satisfaction of any of its desires” (Essay, II.xxi.47). While a desire is suspended, Locke says, our mind, being temporarily freed from the discomfort of the want for the thing desired, has the opportunity to consider the relative worth of that thing. The idea here is that with appropriate deliberation about the value of the desired thing we will come to see which things are really worth pursuing and which are better left alone. And, Locke states, the conclusion at which we arrive after this intellectual endeavor of consideration and examination will indicate what, exactly, we take to be part of our happiness. And, in turn, by a mechanism that Locke does not describe in any detail, our uneasiness and desire for that thing will change to reflect whether we concluded that the thing does, indeed, play a role in our happiness or not (Essay, II.xxi.56). The problem is that there is no clear explanation for how, exactly, the power to suspend works. Despite this, Locke nowhere indicates that suspension is an action of the mind that is determined by anything other than volition of the will. We know that Locke takes all acts of the will to be determined by uneasiness. So, suspending our desires must be the result of uneasiness for something. Investigating how Locke understands human freedom and judgment will allow us to see what, exactly, we are uneasy for when we are determined to suspend our desires.

c. Freedom

When the nature of the human will is under discussion, we often want to know the extent of this faculty’s freedom. The reason why this question is important is because we want to see how autonomously the will can act. Typically, the question takes the form of: is the will free? Locke unequivocally denies that the will is free, implying, in fact, that it is a category mistake to ask the question at all. This is because, on his view, both the will and freedom are powers of agents, and it is a mistake to think that one power (the will) can have as a property a second power (freedom) (Essay, II.xxi.20). Instead, Locke thinks that the right question to pose is whether the agent is free. He defines freedom in the following way:

[T]he Idea of Liberty, is the Idea of a Power in any Agent to do or forbear any particular Action, according to the determination or thought of the mind, whereby either of them is preferr’d to the other; where either of them is not in the Power of the Agent to be produced by him according to his Volition, there he is not at Liberty, that Agent is under Necessity. (Essay, II.xxi.8)

So, Locke considers that an agent is free in acting when her action is connected to her volition in the right kind of way. That is, when her action (or forbearance of action) follows from her volition, she is free. And, her volition is determined by the “thought of the mind” that indicates which action is preferred.

Notice here that Locke takes an agent to be free in acting when she acts according to her preference—this means that her actions are determined by her preference. This plainly shows that Locke does not endorse a kind of freedom of indifference, according to which the will can choose to command an action other than the thing most preferred at a given moment. This is the kind of freedom most often associated with indeterminism. Freedom, then, for Locke, is no more than the ability to execute the action that is taken to result in the most pleasure at a given moment. The problem with this way of defining freedom is that it seems unable to account for the kinds of actions we typically take to be emblematic of virtuous or vicious behavior. This is because we tend to think that the power of freedom is a power that allows us to avoid vicious actions, perhaps especially those that are pleasurable, in order to pursue a righteous path instead. For instance, on the traditional Christian picture, when we wonder about why God would allow Adam to sin, the response given is that Adam was created as a free being. While God could have created beings that, like automata, unfailingly followed the good and the true, He saw that it was all things considered better to create beings that were free to choose their own actions. This decision was made despite the fact that God foresaw the sinful use to which this freedom would be put. This traditional view explains Adam’s sin in the following way: Adam knew that it was God’s commandment that he was not to eat of the tree of knowledge. Adam also knew that following God’s commandment was the right thing to do. So, in the moment where he was tempted to eat the fruit of the tree of knowledge, he knew it was the wrong thing to do, but did it anyway. This is because, the story goes, in that moment he was free to decide whether to follow the commandment or to give in to temptation. Of his own free choice, Adam decided to follow temptation.
This means that in the moment of original sin, both following God’s commandment and eating the fruit were live options for Adam, and he chose the fruit of his own agency.

Now, on Locke’s system, a different explanation obtains. Given his definition of freedom, it is difficult, at least prima facie, to see how Adam could be blamed for choosing the fruit over the commandment. For, according to Locke, an agent acts freely when her actions are determined by her volitions. So, if Adam’s greatest uneasiness was for the fruit, and the act of eating the fruit was the result of his will commanding such action based on his preference, then he acted freely. But, on this understanding of freedom, it is difficult to see how, exactly, Adam can be morally blamed for eating the fruit. The question now becomes: is Adam to be blamed for anticipating more pleasure from the consumption of the fruit than from following God’s command? In other words, was it possible for Adam to alter the intensity of his desire for the fruit? It seems that on Locke’s view, the answer must be connected to one of the powers he takes human beings to possess—the power to suspend desires. And, in certain passages of the Essay, Locke implies that suspending desires and freedom are linked, suggesting that while agents are acting freely whenever their volitions and actions are linked in the right kind of way, there is, perhaps, a proper use of the power to act freely.

d. Judgment

Locke asserts that the “highest perfection of intellectual nature” is the “pursuit of true and solid happiness.” He adds that taking care not to mistake imaginary happiness for real happiness is “the necessary foundation of our liberty.” And, he writes that the more closely we are focused on the pursuit of true happiness, which is our greatest good, the less our wills are determined to command actions to pursue lesser goods that are not representative of the true good (Essay, II.xxi.51). In other words, the more we are determined by true happiness, the more we will to suspend our desires for lesser things. This suggests that Locke takes there to be a right way to use our power of freedom. Locke indicates that there are instances where it is impossible to resist a particular desire, such as when a violent passion strikes. He also states, however, that aside from these kinds of violent passions, we are always able to suspend our desire for any thing in order to give ourselves the time and the emotional distance from the thing desired in which to consider the worth of that thing relative to our general goal: true happiness. True happiness, or real bliss, on Locke’s view, is to be found in the pursuit of things that are true intrinsic goods, which promise “exquisite and endless Happiness” in the next life (Essay, II.xxi.70). In other words, true good is something like the Beatific Vision.

Now, Locke admits that it is a common experience to be carried by our wills towards things that we know do not play a role in our overall and true happiness. However, while he allows that the pursuit of things that promise pleasure, even if only a temporary pleasure, represents the action of a free agent, he also says that it is possible for us to be “at Liberty in respect of willing” when we choose “a remote Good as an end to be pursued” (Essay, II.xxi.56). The central thing to note here is that Locke is drawing a distinction between immediate and remote goods. The difference between these two kinds of goods is temporal. For instance, acting to obtain the pleasure of intoxication is to pursue an immediate good while acting to obtain the pleasure of health is to pursue a remote good. So, we can suppose here that Locke is suggesting that forgoing immediate goods and privileging remote goods is characteristic of the right use of liberty (but see Rickless for an alternative interpretation). If this is so, it is certainly not a difficult suggestion to accept. Indeed, it is fairly clear that many immediate pleasures do not, in the end, contribute to overall and long-lasting happiness.

The question now, and it is a question that Locke himself poses, is “How Men come often to prefer the worse to the better; and to chase that, which, by their own Confession, has made them miserable” (Essay, II.xxi.56). Locke gives two answers. First, bad luck can account for people not pursuing their true happiness. For instance, someone who is afflicted with an illness, injury, or tragedy is consumed by her pain and is thus unable to adequately focus on remote pleasures. Locke’s second answer is worth quoting: “Other uneasinesses arise from our desire of absent good; which desires always bear proportion to, and depend on the judgment we make, and the relish we have of any absent good; in both which we are apt to be variously misled, and that by our own fault” (Essay, II.xxi.57).

Here Locke states that our own faulty judgment is to blame for our preferring the worse to the better. This is because, on his view, the uneasiness we have for any given object is directly proportional to the judgments we make about the merit of the things to which we are attracted. So, if we are most uneasy for immediate pleasures, it is our own fault because we have judged these things to be best for us. In this way, Locke makes room in his system for praiseworthiness and blameworthiness with respect to our desires: absent illness, injury, or tragedy, we ourselves are responsible for endorsing, through judgment, our uneasinesses. He continues, stating that the major reason why we often misjudge the value of things for our true happiness is that our current state fools us into thinking that we are, in fact, truly happy. Because it is difficult for us to consider the state of true, eternal happiness, we tend to think that in those moments when we enjoy pleasure and feel no uneasiness, we are truly happy. But such thoughts are mistaken on his view. Indeed, as Locke says, the greatest reason why so few people are moved to pursue the greatest, remote good is that most people are convinced that they can be truly happy without it.

The cause of our mistaken judgments is the fact that it is very difficult for us to compare present and immediate pleasures and pains with future or remote pleasures and pains. In fact, Locke likens this difficulty to the trouble we typically experience in correctly estimating the size of distant objects. When objects are close to us, it is easy to determine their size. When they are far away, it is much more difficult. Likewise, he says, for pleasures and pains. He notes that if every sip of alcohol were accompanied by headache and nausea, no one would ever drink. But, “the fallacy of a little difference in time” provides the space for us to mistakenly judge that the alcohol contributes to our true happiness (Essay, II.xxi.63). We experience this difficulty of judging remote pleasures and pains due to the “weak and narrow Constitution of our Minds” (Essay, II.xxi.64). The condition of our minds makes it easy for us to think that there could be no greater good than the relief of being unburdened of a present pain. In order to correct this problem and convince a man to judge that his greatest good is to be found in a remote thing, Locke says that all we must do is convince him that “Virtue and Religion are necessary to his Happiness” (Essay, II.xxi.60). Locke explains that a “due consideration will do it in some cases; and practice, application, and custom in most” (Essay, II.xxi.69). The suggestion is that contemplation and deliberation alone may be sufficient to correct our problem of considering all immediate pleasures and pains to be greater than any future ones. And, if that does not work, practice and habit can also correct this problem. By practice and exposure, we can, according to Locke, change the agreeableness or disagreeableness of things. It seems, then, that the power to suspend desire must be the power to reject immediate pleasures in favor of the pursuit of remote or future pleasures.
However, it seems that in order to suspend in this way, we must already have judged that these immediate pleasures are not representative of the true good. For, without this kind of prior judgment, it seems that we would not be in a position to suspend in the way that is required. This is because absent the prior judgment, there would be no reason for the uneasiness we felt for the perceived good to not determine the will. The question to resolve now is how to get ourselves into a position where we are uneasy for the remote, true good and can suspend our desires for immediate pleasures. In other words, we must determine how we can come to seriously judge immediate pleasures to not have a part in our true happiness.

5. Living the Moral Life

In order to behave in a way that will lead us to the greatest and truest happiness, we must come to judge the remote and future good, the “unspeakable,” “infinite,” and “eternal” joys of heaven to be our greatest and thus most pleasurable good (Essay, II.xxi.37–38). But, on Locke’s view, our actions are always determined by the thing we are most uneasy about at any given moment. So, it seems, we need to cultivate the uneasiness for the infinite joys of heaven. But if, as Locke suggests, the human condition is such that our minds, in their weak and narrow states, judge immediate pleasures to be representative of the greatest good, it is difficult to see how, exactly, we can circumvent this weakened state in order to suspend our more terrestrial desires and thus have the space to correctly judge which things will lead to our true happiness. While in the Essay Locke does not say as much as we might like on this topic, elsewhere in his writings we can get a sense for how he might respond to this question.

In 1684, Locke was asked by his friend Edward Clarke for advice about raising and educating his children. In 1693, Locke’s musings on this topic were published as Some Thoughts Concerning Education (hereafter: Education). This text provides insight into the importance that Locke places on the connection between the pursuit of true happiness and early childhood education in general. Locke begins his discussion by noting that happiness is crucially dependent on the existence of both a sound mind and a sound body. He adds that it sometimes happens that by a great stroke of luck, someone is born whose constitution is so strong that they do not need help from others to direct their minds towards the things that will make them happy. But this is an extraordinarily rare occurrence. Indeed, Locke notes: “I think I may say, that, of all the men we meet with, nine parts of ten are what they are, good or evil, useful or not, by their education” (Education, §1). It is the education we receive as young children, on Locke’s view, that determines how adept we are at targeting the right objects in order to secure our happiness. He observes that the minds of young children are easily distracted by all kinds of sensory stimuli and notes that the first step to developing a mind that is focused on the right kind of things is to ensure that the body is healthy. Indeed, the objective in physical health is to get the body in the perfect state to be able to obey and carry out the mind’s commands. The more difficult part of this equation is training the mind to “be disposed to consent to nothing, but what may be suitable to the dignity and excellency of a rational creature” (Education, §31). And Locke goes further still, stating that the foundation of all virtue is to be placed in the ability of a human being to “deny himself his own desires, cross his own inclinations, and purely follow what reason directs as best, though the appetite lean the other way” (Education, §33).
The way to do this, he says, is to resist immediately present pleasures and pains and to wait to act until reason has determined the value of the desirable things in one’s environment.

Locke states that we must recognize the difference between “natural wants” and “wants of fancy.” The former are the kinds of desires that must be obeyed and that no amount of reasoning will allow us to give up. The latter, however, are created. Locke states that parents and teachers must ensure that children develop the habit of resisting any kind of created fancy, thus keeping the mind free from desires for things that do not lead to true happiness (Education, §107). If parents and teachers are successful in blocking the development of “wants of fancy,” Locke thinks that the children who benefit from this success will become adults who will be “allowed greater liberty” because they will be more closely connected to the dictates of reason and not the dictates of passion (Education, §108). So, in order to live the moral life and listen to reason over passions, it seems that we need to have had the benefit of conscientious care-givers in our infancy and youth (see also Government, II.63). This raises the difficulty of how to connect an individual’s moral successes or failures with the individual herself. For, if she had the bad moral luck of unthinking or careless parents and teachers, it seems difficult to see how she could be blamed for failing to follow a virtuous path.

One way of approaching this difficulty is to recall that Locke takes the content of the law of nature, the moral law decreed by God, to be the preservation both of ourselves and of the other people in our communities in order to glorify God (Law, IV). The dictate to help to preserve the other people in our community shifts some of the moral burden from the individual onto the community. This means that it is every individual’s responsibility to do all they can, all things considered, to preserve themselves and to ensure, to the best of their ability, that the children in their communities are raised to avoid developing wants of fancy. In this way, children will develop the habit of suspending their desires for terrestrial pleasures and focusing their efforts on attaining the true happiness that results from acting to secure remote goods.

6. References and Further Reading

a. Primary Sources

An Essay Concerning Human Understanding. Edited by Peter H. Nidditch. Oxford: Clarendon Press, 1975.
- This is the critical edition of Locke’s Essay. The body of the text is based on the fourth edition of the Essay and all the changes from the first edition through the fifth (1689, 1694, 1695, 1700, 1706) are indicated in the footnotes. The text also includes a comprehensive foreword by Nidditch. Note that Locke’s orthography, grammar, and style are often quite different from the way that academic English is written today. In the citations from this text in particular, all emphases, capitalization, and odd spelling are original to Locke.
Essays on the Laws of Nature. Edited and translated by W. von Leyden. Oxford: Clarendon Press, 1954.
- This edition includes both the original Latin and the English translation of the essays. It also includes Locke’s valedictory speech as censor of moral philosophy at Christ Church and some other shorter pieces of writing. Von Leyden’s introduction provides a very detailed discussion of the sources of Locke’s arguments in these essays, the arguments themselves, and the relations these arguments bear to other of Locke’s writings. It is worth noting here that on von Leyden’s interpretation, it is not possible to render Locke’s discussion of natural law consistent with his endorsement of a hedonistic motivational system in later works.
Political Essays. Edited by Mark Goldie. Cambridge: Cambridge University Press, 1997.
- This collection includes major writings on politics and government, including Essays on the Laws of Nature, Of Ethick in General, and An Essay on Toleration, in addition to many other minor essays.
The Correspondence of John Locke, in Eight Volumes. Edited by E.S. De Beer. Oxford: Clarendon Press, 1976–89.
- A complete database of Locke’s correspondence including notes about his correspondents, notes about events and proper names mentioned in letters, as well as signposts for what was going on in Locke’s life at the time he was writing. The first volume of the collection includes an exhaustive introduction to Locke’s life, work, and contacts in the academic and social world; an explanation of how Locke’s letters were preserved; a discussion of previous publications of Locke’s correspondence and how they relate to this collection; and information about transcription practices, including details about editorial grammar decisions and the dating of the letters.
The Works of John Locke, in Nine Volumes, 12th edition. London: Rivington, 1824.
- This collection includes most of Locke’s longer texts, some shorter texts and a selection of letters. Among other things, the collection contains: Essay (vols.1 and 2), his correspondence with Stillingfleet (vol.3), Two Treatises of Government (vol.4), Letters on Toleration (vol.5), The Reasonableness of Christianity (vol.6), notes on St. Paul’s Epistles (vol.7), Some Thoughts Concerning Education and A Discourse of Miracles (vol.8), and a selection of letters (vol.9).

b. Secondary Sources: Books

Aaron, Richard I. John Locke. Oxford: Oxford University Press, 1971.
- This is a comprehensive study of Locke’s life and works and includes fifteen very nice pages on Locke’s moral philosophy. Importantly, Aaron concludes that Locke fails to provide his readers with a science of morals and, in fact, that Locke’s disparate comments about ethics and moral principles cannot be reconciled.
Colman, John. John Locke’s Moral Philosophy. Edinburgh: Edinburgh University Press, 1983.
- In this study, Colman addresses the major themes and problems of Locke’s moral theory including the connection between law and obligation, and the connection between moral principles and demonstrability.
Darwall, Stephen. The British Moralists and the Internal ‘Ought’: 1640–1740. Cambridge: Cambridge University Press, 1995.
- This is a deep and broad study of moral philosophy from the mid 17th to the mid 18th century. Locke is one among several central figures under discussion. The reader greatly benefits from Darwall’s careful discussions of the theoretical connections between Locke and his contemporaries and his influences on the topics of natural law, autonomy, motivation, duty, and freedom.
Lolordo, Antonia. Locke’s Moral Man. Oxford: Oxford University Press, 2012.
- In this study, Lolordo draws on different parts of the Essay in order to see Locke’s theory of agency. She argues in favor of the interpretation according to which there are two senses of freedom in Locke’s view, one of which is properly used to attain the goal proper to a moral agent. Of particular interest is her discussion that links Locke’s comments about personal identity to moral agency and her claim that, for Locke, metaphysics is unnecessary for ethics.
Mabbot, J.D. John Locke. London: Macmillan Press, 1973.
- This is a study of Locke’s philosophical system that focuses on knowledge acquisition, logic and language, ethics and theology, and political theory. In his discussion of ethics and theology, Mabbot traces Locke’s discussions of moral principles, their demonstrability, and their binding force through The Two Treatises of Government, The Essays on the Laws of Nature, and An Essay Concerning Human Understanding.
Schouls, Peter A. Reasoned Freedom: John Locke and Enlightenment. Ithaca: Cornell University Press, 1992.
- This is a defense of the view that Locke was a great influence on enlightenment thought, in particular in the domains of reason and freedom. Schouls also points out what he takes to be many inconsistencies across and sometimes within Locke’s texts.
Yaffe, Gideon. Liberty Worth the Name: Locke on Free Agency. Princeton: Princeton University Press, 2000.
- This is a book-length study of Locke’s view of human freedom. The content includes careful analysis of the chapter ‘Of Power’ of the Essay in addition to comments about how this chapter is connected to Locke’s discussion of personal identity. Yaffe defends an interpretation according to which Locke’s view contains two definitions of freedom, only one of which is “worth the name”—the kind of freedom that allows the pursuit of true good.

c. Secondary Sources: Articles

Chappell, Vere. “Locke on the Intellectual Basis of Sin.” Journal of the History of Philosophy 32 (1994): 197–207.
Chappell, Vere. “Locke on the Liberty of the Will.” In Locke’s Philosophy: Content and Context. Edited by G.A.J. Rogers, 101–21. Oxford: Oxford University Press, 1994.
Chappell, Vere. “Power in Locke’s Essay.” In The Cambridge Companion to Locke’s “An Essay Concerning Human Understanding.” Edited by Lex Newman, 130–56. Cambridge: Cambridge University Press, 2007.
- In these articles, Chappell advances the interpretation that changes made in the fifth edition of the Essay indicate that Locke changed his view about human freedom.
Darwall, Stephen. “The Foundations of Morality,” In The Cambridge Companion to Early Modern Philosophy. Edited by Donald Rutherford, 221–49.
- This paper canvasses the main themes explored by and influences on early modern moral theories, including Locke’s.
Glauser, Richard. “Thinking and Willing in Locke’s Theory of Human Freedom,” Dialogue 42 (2003): 695–724.
- Glauser argues that Locke’s view remains consistent across the changes made in the various editions of the Essay.
Magri, Tito. “Locke, Suspension of Desire, and the Remote Good,” British Journal for the History of Philosophy 8 (2000): 55–70.
- Magri argues that Locke’s view changes over the course of the different editions of the Essay, in particular that he moves from having an “internalist” view of motivation to having an “externalist” view of motivation. Magri casts doubt on the consistency of Locke’s position.
Mathewson, Mark D. “John Locke and the Problems of Moral Knowledge,” Pacific Philosophical Quarterly 87 (2006): 509–26.
- Mathewson argues that Locke’s comments about the nature of moral ideas lead to moral subjectivity and relativism.
Rickless, Samuel. “Locke on Active Power, Freedom, and Moral Agency,” Locke Studies 13 (2013): 31–51.
Rickless, Samuel. “Locke on the Freedom to Will.” Locke Newsletter 31 (2000): 43–68.
- In these papers, Rickless argues that Locke holds one and only one definition of freedom: the ability to act according to our volitions. According to Rickless, Locke holds the same definition of freedom as Hobbes. The 2013 paper is a direct argument against the interpretation advanced by Lolordo in Locke’s Moral Man.
Schneewind, J.B. “Locke’s Moral Philosophy,” The Cambridge Companion to Locke. Edited by Vere Chappell. Cambridge: Cambridge University Press, 1994.
- Schneewind is one commentator who thinks that Locke’s moral philosophy ends up in a contradiction between the natural law view and hedonism.
Walsh, Julie. “Locke and the Power to Suspend Desire,” Locke Studies, 14 (2014).
- Walsh argues that Locke’s view remains consistent and coherent across the various editions of the Essay and emphasizes the role played by suspension and judgment in attaining true happiness.

Author Information

Julie Walsh
Email: julie.walsh@wellesley.edu
Wellesley College
U. S. A.

Truthmaker Theory

Truthmaker theory is the branch of metaphysics that explores the relationships between what is true and what exists. Discussions of truthmakers and truthmaking typically start with the idea that truth depends on being, and not vice versa. For example, if the sentence ‘Kangaroos live in Australia’ is true, then there are kangaroos living in Australia. And if there are kangaroos living in Australia, then the sentence ‘Kangaroos live in Australia’ is true. But we can ask whether the sentence is true because of the way the world is, or whether the world is the way it is because the sentence is true. Truthmaker theorists make the former claim that the sentence is true because of what exists in the world; it is not the case that the world is the way it is because of which sentences are true. Truthmaker theorists use this fundamental idea as a starting point for clarifying the nature of truth and its relationship to ontology, and to advance various views in metaphysics concerning the nature of the past and future, counterfactual conditionals, modality, and many others. Because truthmaker theorists end up with differing views concerning all these matters, what ultimately unites them is not any single thesis but rather a commitment to thinking that the idea of truthmaking is a useful one for pursuing metaphysical inquiry. Others might conceive of ‘truthmaker theory’ more strictly (such as by requiring a commitment to all truths having truthmakers, or all truthmakers being of a particular ontological variety), though defining the enterprise in this way will inevitably fail to capture all those earnestly pursuing investigation into truthmaking.

Philosophical discussion of truthmakers falls into two broad categories. First, there are ‘internal’ debates about the nature of truthmaker theory itself. For instance, there are open questions as to which truths have truthmakers: do all truths have truthmakers, or just some proper subset of truths (such as the positive truths or synthetic truths)? There are questions as to the nature of the truthmaking relation: is it a necessary relation or a contingent one? Is it a kind of supervenience, dependence, or something else? And it is an open question as to what sorts of objects serve as truthmakers: perhaps there are states of affairs, tropes, or counterparts that serve as truthmakers, or perhaps none of these. There is also frequent debate as to whether truthmaker theory constitutes a theory of truth (similar to, in particular, the correspondence theory of truth), or whether it is an entirely separate philosophical enterprise, one concerned more with metaphysics than with semantics.

There are also ‘external’ truthmaking discussions that apply basic ideas about truthmaking to longstanding metaphysical topics. The hope is that truthmaker theory can bring new insights and argumentative resources to bear on traditional metaphysical inquiries. For example, truthmaker theorists investigate whether presentism—the view that only the present exists—can satisfy the obligations of truthmaker theory. Truthmaker theory has also been wielded against metaphysical views such as behaviorism and phenomenalism, and it has made contributions to the metaphysics of modality.

History of Truthmaker Theory
The Truthmaking Relation
Maximalism and Non-Maximalism
Kinds of Truthmakers
Truthmaking Principles
Truthmaking and Truth
Truthmaking and the Past
Truthmaking and Modality
Objections to Truthmaker Theory
References and Further Reading

1. History of Truthmaker Theory

Perhaps the first occurrence of a basic truthmaking idea is found in Aristotle’s Categories. There Aristotle points out that if a certain man exists, then a statement that that man exists is true, and vice versa. But it seems that there is a difference in priority between these two states of affairs. The statement is true because the man exists; it is not the case that the man exists because the statement is true. Aristotle is, in effect, raising a ‘Euthyphro’ question, drawing on Plato’s famous dialogue. Is the statement true because of the way the world is, or is the world the way it is because of which statements are true? Aristotle chose the former answer, and set the stage for discussions of truthmakers far down the road.

The idea of a truthmaker did not play a significant role in philosophy until the rise of logical atomism in the work of Bertrand Russell and Ludwig Wittgenstein. In the Philosophy of Logical Atomism, Russell takes it to be a truism that there are facts, and says that facts are the sort of thing that make propositions true or false. The project of logical atomism is then to determine what sorts of facts are ontologically required in order to make true all the different kinds of propositions. The most basic kind of fact for Russell is the atomic fact, which consists of no more than the possession of a quality by a particular object (or of the holding of a relation between multiple objects). Sentences like ‘X is green’ and ‘X is heavier than Y’, if true, are made true by atomic facts. More complex sentences like ‘X is green and is heavier than Y’ do not call for more complex, ‘molecular’ facts. Instead, the same atomic facts from before can explain the truth of conjunctive sentences. Particularly worrisome are negative truths, such as ‘X is not red’. Russell believed that there need to be negative facts to account for negative truths. In advocating for the existence of negative facts, Russell claims to have ‘nearly produced a riot’ when he suggested the idea at a seminar at Harvard (1985: 74). The idea that reality contains entities that are fundamentally negative in nature has long struck many philosophers as puzzling and metaphysically unacceptable, and there has been continuing controversy over what, if anything, makes negative truths true.

The next major advance in truthmaker theory came from the work of the Australian philosopher David Armstrong. Armstrong—who credits fellow philosopher Charlie Martin with inspiring him on the topic—has long advocated the use of truthmakers in metaphysics. Armstrong cites two paradigm examples of how truthmakers can be put to work in philosophy. First, there is the case of behaviorism, as defended by Gilbert Ryle (1949). Ryle’s philosophy of mind relies heavily on dispositions; Ryle thought that claims involving mental terms could be analyzed into subjunctive conditionals involving dispositions. What it is for Ryle to believe that he is a philosopher is that if he were to be asked what his profession was, he would reply ‘philosopher’. While this counterfactual may be true, the truthmaker theorist asks: but what is it that makes it true? The behaviorist faces a challenge of either accepting this counterfactual as a brute truth, a truth with no further explanation, or admitting that it is made true by some sort of mental state, thus abandoning the supposed ontological economy of behaviorism.

Similarly, Armstrong argues that the phenomenalism of philosophers such as Berkeley and Mill faces a parallel difficulty. According to phenomenalism, all that exists are sensory impressions. But might it not be true that there is a rock on the dark side of the moon that no one has ever observed? The phenomenalist accounts for this idea by claiming that if you were to go to that part of the moon, you would have a rock-like sensory impression. But again: what makes that counterfactual true? The anti-phenomenalist will say that the counterfactual is true because it is made true (at least in part) by the rock itself. The phenomenalist, limited by an ontology of actual sense impressions, is hard-pressed to find a plausible answer to the truthmaker theorist’s question.

In the wake of Armstrong’s (and others’) writings, truthmaker theory became a lively corner of contemporary metaphysics.

2. The Truthmaking Relation

A key concern of truthmaker theory is giving an account of the truthmaking relation. When some object X is a truthmaker for some truth Y, what is the nature of the relationship that X and Y stand in?

One universally agreed upon fact about the truthmaking relation is that it is not a one-one relation. That is, in principle an object can be a truthmaker for multiple truths, and any given truth can have multiple truthmakers. For example, Socrates is frequently thought to be a truthmaker not only for ‘Socrates exists’, but also for ‘Socrates is human’ and ‘There are humans’. For it is impossible that Socrates—who is essentially human—could exist and yet any of those sentences be false (at least given some familiar assumptions about essences). Similarly, ‘There are humans’ is made true by many things—anything that is essentially human, in fact. Hence, it can be misleading to ask what the truthmaker for some truth is, since it is not necessary that truths have only one, unique truthmaker.

So what exactly is the nature of the relation? To ask this question is to probe what sort of analysis, if any, can be given of the truthmaking relation. Many truthmaker theorists have argued that the truthmaking relation, at the least, requires metaphysical necessitation. Some object X necessitates the truth of Y if and only if it is metaphysically impossible for X to exist, and yet Y not be true. In the language of possible worlds, X necessitates Y if and only if every possible world in which X exists is a world in which Y is true. Necessitation is thought to be a necessary component of the truthmaking relation because it shows that the truthmaker’s existence is a sufficient condition on the truth in question. If X’s existence were not enough to guarantee Y’s truth, then X would not yet adequately explain or account for the truth of Y. Something else, in addition to X, would be needed to properly account for Y’s truth.
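The possible-worlds reading of necessitation can be sketched in a small toy model. This is only an illustration under stated assumptions: the three worlds, the names `WORLDS` and `necessitates`, and the choice of sentences are all hypothetical and are not drawn from the literature.

```python
# Toy possible-worlds model. The worlds, entities, and sentences are
# invented for illustration; they are not drawn from the literature.
WORLDS = {
    "w1": {"exists": {"Socrates", "Plato"},
           "true": {"Socrates exists", "There are humans"}},
    "w2": {"exists": {"Plato"},
           "true": {"There are humans"}},
    "w3": {"exists": set(),
           "true": set()},
}


def necessitates(entity, sentence, worlds=WORLDS):
    """Return True iff `sentence` is true in every world where `entity` exists."""
    return all(
        sentence in world["true"]
        for world in worlds.values()
        if entity in world["exists"]
    )


socrates_for_existence = necessitates("Socrates", "Socrates exists")
socrates_for_humans = necessitates("Socrates", "There are humans")
plato_for_humans = necessitates("Plato", "There are humans")
plato_for_socrates = necessitates("Plato", "Socrates exists")
```

In this model Socrates necessitates two different sentences, and ‘There are humans’ is necessitated by two different objects, which matches the point that the relation is not one-one. Plato does not necessitate ‘Socrates exists’, because the hypothetical world `w2` contains Plato without that sentence being true.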

Not all theorists have agreed that necessitation is necessary for truthmaking. Hugh Mellor (2003), for instance, at one point argued that truthmakers need not necessitate the truths that they make true. Mellor relied on the controversial case of general truths, such as ‘All gold spheres are less than a mile in diameter’. Suppose there are three such spheres, A, B, and C. Then there are three states of affairs (Mellor calls them ‘facta’): A’s being less than a mile in diameter, B’s being less than a mile in diameter, and C’s being less than a mile in diameter. For Mellor, the truthmaker for the general truth is no more than the sum of the three states of affairs. But these three states of affairs do not necessitate the truth of ‘All gold spheres are less than a mile in diameter’, since it is possible that that very sum could exist, and yet the sentence be false. That is a case where, for example, A, B, and C all exist with diameters less than a mile, but a fourth gold sphere D exists whose diameter is greater than a mile. Mellor reasons that the sum of the three states of affairs is the truthmaker for ‘All gold spheres are less than a mile in diameter’, and thus concludes that truthmaking does not require necessitation. (Furthermore, on his view, the truthmaking relation is contingent in the sense that whether X is a truthmaker for Y can vary from world to world. Those who accept necessitation would reject this consequence.) Other theorists argue that truthmaking does require necessitation, and so the sum is not a truthmaker for the sentence; something else (such as one of the totality states of affairs discussed below) is needed to provide a truthmaker, or perhaps it has no truthmaker at all (according to advocates of the supervenience accounts discussed below).

It is more common for philosophers to challenge the sufficiency of the necessitation condition, rather than its necessity. The concern that necessitation is not enough derives in large part from the fact that all objects necessitate the truth of all necessary truths. This is the problem of trivial truthmakers for necessary truths. For example, Socrates necessitates ‘2 + 2 = 4’, for it is metaphysically impossible for Socrates to exist and yet ‘2 + 2 = 4’ be false. Similarly, if God exists, and exists necessarily, then a torn, dog-eared copy of Lolita rotting away in some landfill necessitates the truth of ‘God exists’. If it is impossible for that sentence to be false, then it is impossible for that sentence to be false should that rotting copy of Lolita exist. But—according to this line of thought—Socrates is not a truthmaker for ‘2 + 2 = 4’, and the copy of Lolita is not a truthmaker for ‘God exists’. Truthmaking requires more than just necessitation.
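The problem of trivial truthmakers can be made vivid with a toy model in which one sentence is true at every world. The sketch below is illustrative only: the worlds, the entity labels, and the helper `necessitates` are assumptions introduced here, and a finite model can do no more than mimic metaphysical necessity.

```python
# Toy model: NECESSARY is stipulated to be true at every world.
# All worlds and entity labels are hypothetical.
NECESSARY = "2 + 2 = 4"

WORLDS = {
    "w1": {"exists": {"Socrates", "copy of Lolita"},
           "true": {NECESSARY, "Socrates exists"}},
    "w2": {"exists": {"copy of Lolita"},
           "true": {NECESSARY}},
    "w3": {"exists": {"Socrates"},
           "true": {NECESSARY, "Socrates exists"}},
}


def necessitates(entity, sentence, worlds=WORLDS):
    """Return True iff `sentence` is true in every world where `entity` exists."""
    return all(
        sentence in world["true"]
        for world in worlds.values()
        if entity in world["exists"]
    )


# Collect every entity that exists in at least one world.
all_entities = set().union(*(world["exists"] for world in WORLDS.values()))

# Every entity whatsoever necessitates the necessary truth.
trivial = {entity: necessitates(entity, NECESSARY) for entity in all_entities}

# Necessitation still discriminates among contingent truths.
lolita_for_socrates = necessitates("copy of Lolita", "Socrates exists")
```

Every entry in `trivial` comes out true, while the copy of Lolita fails to necessitate the contingent sentence ‘Socrates exists’. Necessitation alone therefore cannot separate relevant from irrelevant candidates when the truth in question is necessary.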

Theorists divide as to what exactly else is required of the truthmaking relation. Trenton Merricks (2007) has argued that truthmaking requires ‘aboutness’, in that X is a truthmaker for Y only if Y is about X. Mathematical claims are not about Socrates, and so Socrates cannot make them true. ‘God exists’ is about God, so only God is a candidate truthmaker for it. Those who accept Merricks’s proposal thereby avoid the problem of trivial truthmakers for necessary truths.

E. J. Lowe (2007) conceives of truthmaking as depending upon the essences of propositions. X is a truthmaker for Y only if it is part of the essence of Y that it be true should an object like X exist. This amendment solves the problem of trivial truthmakers because it is no part of the essence of the proposition expressed by ‘God exists’ that it be true should the copy of Lolita exist. The essence of the proposition that God exists has nothing to do with the rotting copy of Lolita, just as the proposition that two and two are four has nothing to do with Socrates. Lowe criticizes his own view on the grounds that it implies that propositions can be related to things that do not exist. For example, Batman could have been a truthmaker for ‘There are humans’, since the nature of the proposition that there are humans is such that it would be true if things like Batman existed. So according to Lowe’s account, the proposition’s essence appears to stand in a relation to a non-existent entity, which is concerning for anyone who takes relations to entail the existence of their relata.

Regardless of how the problem of trivial truthmakers is solved, theorists seem to be agreed that the truthmaking relation, however ultimately analyzed, needs to be treated as a hyperintensional relation. That is, as a matter of necessity, a particular object could exist and a particular claim could be true in all the same possible worlds without that object being a truthmaker for the claim. Hence, truthmaking is a relation that is more discriminating than modal relations such as necessitation. Truthmaking is thus more like a dependence relation, or a grounding relation, than relations like necessitation or supervenience. Sometimes it is said that truthmaking is an ‘in virtue of’ relation: X is a truthmaker for Y because Y is true ‘in virtue’ of the existence of X (for example, Rodriguez-Pereyra 2006c). X is somehow ontologically responsible for the truth of Y, and no merely intensional relation is thought to capture this deeper connection between a truth and its truthmaker.

Some theorists accept that truthmaking is a kind of ‘in virtue of’ relation, but deny that it can be further analyzed. This is the view of, for example, Gonzalo Rodriguez-Pereyra (2006c), who holds that the truthmaking relation is a primitive notion that resists analysis.

In addition to the project of analyzing the components of the truthmaking relation (or admitting that such an analysis cannot be offered), there is also a question of what the structural and logical features of the relation are. One issue concerns the nature of the kinds of relata that the relation takes. The relation is typically understood to hold between a truth and a truthmaker. In this sense it is usually ‘cross-categorial’ in that it obtains between very different kinds of things, items from different categories. The truth that Socrates exists is made true by Socrates: here we have a case where the truthmaking relation obtains between a person and a truthbearer.

For many truthmaker theorists, there is no restriction on the kind of object that can be a truthmaker. To be a truthmaker, something just needs to appropriately account for the truth of some truthbearer. On this view, truthmakers are just whatever sorts of things are ontologically available. Other views impose restrictions. For example, one might argue that only facts or states of affairs are properly thought of as truthmakers. On this view, Socrates could not be a truthmaker for ‘Socrates exists’ because Socrates himself is not a fact or state of affairs. (At best he is a sort of abstraction from various states of affairs or facts.) There must be some other entity, such as the fact that Socrates exists, or a state of affairs composed by Socrates and an existence property, that makes the sentence true. Other views would find this perspective ontologically inflating: we do not need, in addition to Socrates, some further state of affairs that requires a property of existence in order to give an ontological account of the truth of ‘Socrates exists’. Finally, some have thought that only certain entities deserve to be thought of as truthmakers, such as fundamental entities (for example, Cameron 2008). On this view, X is a truthmaker for Y only if X is a fundamental entity.

As for the other side of the truthmaking relation, theorists disagree as to what sorts of objects are the bearers of truth. More restrictive views maintain that there is only one sort of truthbearer, or that there is only one primary kind of truthbearer, compared to which all other truthbearers are derivative. For example, a common view is that some sentence or belief bears truth only in virtue of expressing a true proposition, where propositions are the primary bearers of truth and falsity. More liberal views are happy to concede that there are a variety of truthbearers, and that they can all stand in the truthmaking relation. It is not clear that substantive questions about truthmaker theory turn on one’s background views about truthbearers, but it is wise to be sensitive to the ways in which truthmaking considerations might be affected by issues concerning truthbearers. For example, one could argue that while Socrates is a sufficient truthmaker for the proposition that Socrates exists (for it is impossible for Socrates to exist and yet that proposition be false), he is not a sufficient truthmaker for the sentence ‘Socrates exists’ because it is possible for Socrates to exist and yet the sentence be false, should the sentence have turned out to have a different meaning. For example, it is possible that ‘Socrates exists’ could have meant something else—such as that Socrates is Persian—and so it is possible that Socrates could have existed and that sentence be false. On this reading, then, one might take the truthmaker for the (uninterpreted) sentence to be more involved than the truthmaker for the proposition that sentence contingently expresses. What makes ‘Socrates exists’ true is Socrates plus whatever it is that makes it true that ‘Socrates exists’ means that Socrates exists.

Finally, consider some of the logical features of the truthmaking relation. In particular, there is the issue of how truthmaking stands with respect to reflexivity, symmetry, and transitivity. A relation is reflexive when every object that stands in the relation stands in the relation to itself. This would mean that every truth is its own truthmaker. The cross-categorial nature of truthmaking prohibits this possibility. Because not all truthmakers are truthbearers, the truthmaking relation is not reflexive.

Many theorists argue that truthmaking is irreflexive, in that there is no instance of something standing in the truthmaking relation to itself. (Hence, irreflexivity is stronger than the view that truthmaking is non-reflexive, which means that not every truth is its own truthmaker.) The general thought here is that truthmaking is a kind of dependence relation, and nothing can depend upon itself. But there are plausible counterexamples to irreflexivity. For example, the proposition that there are propositions appears to be a case of self-truthmaking. Because that proposition exists, it is true. One might respond by saying that the relation in this case actually holds between the existence of the proposition and the truth of the proposition, and so not between one and the same thing. This response, however, requires a substantial rethinking of the nature of the truthmaking relation (such that it no longer holds between truthmakers and truthbearers), and the apparent reification of properties like truth and existence.

Similar remarks apply to symmetry. A symmetric relation is one where if X bears it to Y, Y bears it to X. The cross-categorial nature of truthmaking again shows that the truthmaking relation is not in general symmetric. Not all truthmakers are truthbearers. But because some truthbearers can be truthmakers, the possibility for symmetry arises, in which case the relation is just non-symmetric. (Again, some will resist by suggesting that truthmaking, as a kind of dependence, must be asymmetric: if X depends on Y, Y does not depend on X.) In fact, any case of reflexive truthmaking will provide a case of symmetric truthmaking.

Finally, consider transitivity: if X stands in R to Y, and Y stands in R to Z, then X stands in R to Z. Transitivity fails for obvious reasons. Socrates is a truthmaker for the proposition that Socrates exists, and the proposition that Socrates exists is a truthmaker for the proposition that there are propositions. But Socrates is no truthmaker for the proposition that there are propositions. Truthmaking is not transitive in general, but there could be individual instances of it (drawing on the same cases of reflexivity and symmetry).
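The structural claims about reflexivity, symmetry, and transitivity can be checked mechanically on a small finite relation. The sketch below encodes only the three truthmaking instances mentioned in the examples above. It assumes, controversially, that the proposition that there are propositions really is its own truthmaker, which defenders of irreflexivity would deny. All names in the code are hypothetical.

```python
# Truthmaking instances taken from the examples in the text, written as
# (truthmaker, truthbearer) pairs. Propositions are tagged with angle brackets.
P_SOCRATES = "<Socrates exists>"
P_PROPOSITIONS = "<there are propositions>"

TRUTHMAKING = {
    ("Socrates", P_SOCRATES),
    (P_SOCRATES, P_PROPOSITIONS),
    (P_PROPOSITIONS, P_PROPOSITIONS),  # contested case of self-truthmaking
}


def field(rel):
    """Everything that occurs on either side of the relation."""
    return {item for pair in rel for item in pair}


def is_reflexive(rel):
    return all((x, x) in rel for x in field(rel))


def is_irreflexive(rel):
    return all(x != y for x, y in rel)


def is_symmetric(rel):
    return all((y, x) in rel for x, y in rel)


def is_asymmetric(rel):
    return all((y, x) not in rel for x, y in rel)


def is_transitive(rel):
    return all(
        (x, z) in rel
        for x, y in rel
        for y2, z in rel
        if y == y2
    )


properties = {
    "reflexive": is_reflexive(TRUTHMAKING),
    "irreflexive": is_irreflexive(TRUTHMAKING),
    "symmetric": is_symmetric(TRUTHMAKING),
    "asymmetric": is_asymmetric(TRUTHMAKING),
    "transitive": is_transitive(TRUTHMAKING),
}
```

On these assumptions every check comes out false. The relation is neither reflexive nor irreflexive, neither symmetric nor asymmetric, and not transitive, which is the merely non-reflexive, non-symmetric, and non-transitive profile described above. Removing the contested self-truthmaking pair would make the relation irreflexive and asymmetric in this toy model.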

3. Maximalism and Non-Maximalism

Another central question any truthmaker theorist must address concerns which truths have truthmakers. Perhaps all truths have truthmakers, or perhaps just some proper subset of the truths have truthmakers. Truthmaker maximalism is the thesis that all truths have truthmakers. Truthmaker non-maximalism maintains that there are truthmaker gaps: truths that have no truthmaker.

There have not been many arguments for maximalism. Its defenders frequently claim that the view is on its own quite intuitive and plausible. Resisting maximalism, according to such advocates, threatens to court the view that truths can ‘float free’ of reality. A truth without a truthmaker, on this view, is a brute truth, a truth for which there can be no explanation. Such truths, if they exist, are thought by maximalists to be metaphysically mysterious. Others have argued for maximalism by conceiving of having a truthmaker as being somehow essential to being true. If what it is to be true is to have a truthmaker, then something cannot be true without having a truthmaker. (The relationship between truth and truthmaking is further discussed in section 5.)

One motivation for non-maximalism is the existence of plausible counterexamples to the thesis that all truths have truthmakers. Consider negative existential truths, such as ‘There are no merlions’. On the face of it, the sentence is true not because some kind of thing exists; it is true because nothing of a certain kind exists. A truthmaker for the negative existential would have to be some sort of entity whose existence excluded the existence of merlions, and explained their non-existence. But there is nothing in the world among the ‘positive’ entities that can guarantee that there are no merlions. Take, for example, the set of all the actually existing animals. Taken together, their existence does not guarantee the absence of merlions. For that set of animals could exist and yet it still be true that there are, in addition, merlions. It is only if we somehow combine the existence of those animals together with the fact that those animals are all the animals that we can find a suitable truthmaker for the negative existential.

Armstrong introduced a ‘totaling’ relation in response to these difficulties. For example, there is a state of affairs composed of the sum of all the animals standing in the totaling relation to the property of being an animal. This state of affairs fixes which animals exist, and so excludes the existence of any merlions. Armstrong generalizes this approach when he argues for the existence of what he calls the ‘totality state of affairs’. This is a second-order state of affairs that is composed of the sum of all the first-order states of affairs standing in the totaling relation to the property of being a first-order state of affairs. The existence of this second-order state of affairs thereby guarantees that the first-order states of affairs that partially compose it are all the first-order states of affairs there are. This single totality state of affairs can be a truthmaker for all negative existentials (and every other truth besides).

Like Russell’s negative facts, totality states of affairs are thought by many to be entities that are not fully ‘positive’. Their existence seems to concern what is not in addition to what is, and this is thought to be metaphysically suspicious. One way of putting the worry is that they are entities whose existence bears on the existence of things that are fully distinct from them. Ordinarily, one object’s existence does not bear on the existence of other objects that are separate from it. The existence of the Statue of Liberty neither entails nor excludes the existence of the Eiffel Tower. Neither does their existence exclude the existence of other potential landmarks that happen not to exist (such as a replica of the Statue of Liberty in Victoria Harbour). Totality states of affairs are different. The totality of animals excludes the existence of merlions, though merlions are entirely distinct from totalities of animals. For this reason, some philosophers have sought to develop non-maximalist approaches to truthmaker theory.

One prominent way of defending non-maximalism is to defend alternate principles that attempt to capture the dependence of truth upon being, but without admitting that all truths have truthmakers. One such principle is the thesis that truth supervenes on being, and it has been defended in both strong and weak versions. The strong version, defended by John Bigelow (1988), is the principle that if some proposition P is true at some world W₁ but not world W₂, then there must exist some entity at W₁ that does not exist at W₂, or some entity that exists at W₂ but not W₁. This principle captures the idea that what is true cannot vary from possible world to possible world unless there is some corresponding difference in the ontology of those worlds. Truth thus depends on being, although some truths escape having truthmakers. To see why, suppose that ‘There are merlions’ is false at W₁ but true at W₂. The principle implies that something must exist in one of these worlds but not the other. In this case, there is a merlion that exists at W₂ but does not exist at W₁. Although the negative existential ‘There are no merlions’ is true at W₁, it has no truthmaker in that world. Nevertheless, its truth depends on the ontology of the world in the sense that, had it been false, there would have been something in the world’s ontology (namely, a merlion) that it does not currently have.

David Lewis (2001) has defended a weaker supervenience principle. For Lewis, if some proposition P is true at some world W₁ but not world W₂, then either there must exist some entity at only one of the worlds, or some group of things must stand in some fundamental relation at one of the worlds but not the other. Like the strong supervenience principle, this weaker principle allows one to accept negative existentials as truthmaker gaps, but also allows one to treat contingent predications as truthmaker gaps. For example, suppose that while W₁ and W₂ contain all the same objects, they differ with respect to the properties those objects have: some object O is blue in W₁, but red in W₂. Because ‘O is blue’ is true in W₁ but false in W₂, the strong supervenience principle requires that there be some entity that exists in one of the worlds but not the other. But ex hypothesi the two worlds have the same ontology. The advocate of strong supervenience (alongside the maximalist) requires something like a blueness trope or state of affairs (that is, O’s being blue) to exist in W₁ but not W₂. The contingent predication still needs a truthmaker. The advocate of weak supervenience, by contrast, does not require the contingent predication to have a truthmaker. While there is no entity that guarantees the truth of ‘O is blue’ in W₁, its truth nevertheless depends on being in the sense that had it been false, there would have to be some difference in what exists, or in what properties those things have and what relations they stand in. The worlds where ‘O is blue’ is false are worlds where either O does not exist, or has different properties, such as being red.
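The contrast between the strong and weak supervenience principles can be illustrated with the two-world example of the blue and red object. The sketch below is a simplification under stated assumptions: worlds are modelled as an inventory of entities plus a pattern of property instantiations, and the names `strong_difference`, `weak_difference`, and the trope label are hypothetical.

```python
# Two hypothetical worlds with the same inventory of entities but a
# different pattern of property instantiation.
W1 = {"entities": frozenset({"O"}),
      "properties": frozenset({("O", "blue")})}
W2 = {"entities": frozenset({"O"}),
      "properties": frozenset({("O", "red")})}


def o_is_blue(world):
    """Truth value of the sentence 'O is blue' at a world."""
    return ("O", "blue") in world["properties"]


def strong_difference(a, b):
    """Difference demanded by strong supervenience: some entity exists in only one world."""
    return a["entities"] != b["entities"]


def weak_difference(a, b):
    """Difference accepted by weak supervenience: entities or the pattern of properties differ."""
    return a["entities"] != b["entities"] or a["properties"] != b["properties"]


truth_differs = o_is_blue(W1) != o_is_blue(W2)
strong_satisfied = strong_difference(W1, W2)
weak_satisfied = weak_difference(W1, W2)

# A defender of strong supervenience can restore the principle by adding
# a further entity, here a hypothetical blueness trope, to the first world.
W1_WITH_TROPE = {"entities": frozenset({"O", "O's blueness"}),
                 "properties": frozenset({("O", "blue")})}
strong_satisfied_with_trope = strong_difference(W1_WITH_TROPE, W2)
```

In the model the truth of ‘O is blue’ differs between the worlds while their inventories of entities coincide, so the strong principle is met only once an extra entity is posited. The weak principle is already met by the difference in which properties O has.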

Maximalism, strong supervenience, and weak supervenience are all attempts to capture the basic intuition behind truthmaker theory, and avoid the commitment to there being truths that ‘float free’ of reality. Some philosophers, however, have admitted that there are truths that do not depend on being at all, in any sense. Roy Sorensen (2001), for example, has argued that the puzzling truthteller sentence ‘This very sentence is true’ has a determinate truth value, but that it can never be known. Unlike the paradoxical liar sentence (‘This very sentence is false’), the truthteller is consistent: it can be true or false without contradiction. Sorensen argues that the truthteller is what we might call a deep truthmaker gap. Its truth does not depend on being in any sense, whereas shallow truthmaker gaps like contingent predications and negative existentials (if indeed they are truthmaker gaps) still in some sense depend on being. Sorensen argues that the truthteller’s status as a deep truthmaker gap explains why its truth value is unknowable: because we usually come to know truths by way of some kind of connection to their truthmakers, the fact that the truthteller (or its negation) lacks a truthmaker explains why we do not know its truth value.

Other forms of non-maximalism include the thesis that only ‘positive’ truths have truthmakers (however the positive/negative distinction may be articulated), that only synthetic truths have truthmakers, and that only contingent truths have truthmakers. It is incumbent upon theorists adopting such views to explain why negative, analytic, or necessary truths are best thought of as not requiring truthmakers when accounting for their truth.

Finally, consider the following argument against maximalism, which does not turn at all on the plausibility of the various sorts of ontological truthmaking posits. Consider the sentence ‘This very sentence has no truthmaker’. This sentence is provably true (see Milne 2005). To see why, first suppose it is false. In that case, it has a truthmaker, in which case it is true: contradiction. So it must be true after all. Therefore, it has no truthmaker, since that is what it says about itself. It is a truthmaker gap. Here, simple reasoning leads to the view that there is at least one truth without a truthmaker. Many maximalists reject this argument (sometimes by assimilating it to the liar paradox), but nevertheless it remains to be seen where the reasoning goes wrong (see, for example, Rodriguez-Pereyra 2006a).

4. Kinds of Truthmakers

Truthmaker theorists are motivated by ontological questions: we can make progress on figuring out what exists by pursuing questions about what truthmakers there are. Considerations about truthmaking have thus led to different views about what exactly is included in the world’s ontology. These considerations often go hand in hand with the ancient metaphysical debate between realists and nominalists in discussions over the nature and existence of universals.

In his logical atomism, Russell just accepted as a truism the existence of facts, which are the sorts of things that make propositions true. Armstrong accepts the existence of similar objects, but he calls them ‘states of affairs’. A state of affairs is a complex object composed (in a non-mereological way) by a particular together with a universal. To offer a simplified example, suppose there is a universal of being a philosopher. Socrates instantiates this universal, and so in addition to the existence of Socrates and the universal, there is a third thing—we might call it ‘Socrates’s being a philosopher’—that is a kind of fusion of the other two.

Armstrong offers a truthmaking argument for the existence of states of affairs. It is true that Socrates is a philosopher. But Socrates does not make this claim true. Because the claim is a contingent predication, it is possible that Socrates could have existed and yet not been a philosopher. So Socrates does not necessitate the truth of ‘Socrates is a philosopher’, and so is not a truthmaker for the sentence. Nor does the universal being a philosopher necessitate ‘Socrates is a philosopher’, for it might have existed without Socrates being a philosopher. (Something else could have instantiated the universal.) Furthermore, not even the mereological sum of Socrates together with being a philosopher necessitates ‘Socrates is a philosopher’. For a world in which Socrates exists but is not a philosopher, though someone else is, is a world where the mereological sum exists but the sentence is false. On this basis, Armstrong argues that there must be something else, a state of affairs, that is a fusion of the particular and the property. Every world where the state of affairs composed by Socrates and being a philosopher exists is a world where ‘Socrates is a philosopher’ is true. On this basis, Armstrong defends the existence of states of affairs in the name of offering a satisfying truthmaker theory for contingent predications.
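Armstrong’s elimination of the rival candidates can be traced in a two-world toy model. The sketch is illustrative only, and it builds in assumptions not argued for here: that the universal exists wherever something instantiates it, that the mereological sum exists wherever both of its parts exist, and that the state of affairs exists only where Socrates instantiates the universal. The entity labels and the helper `necessitates` are hypothetical.

```python
SENTENCE = "Socrates is a philosopher"

# Candidate truthmakers, labelled by hypothetical names.
SOCRATES = "Socrates"
UNIVERSAL = "being a philosopher"
SUM = "Socrates + being a philosopher"
STATE_OF_AFFAIRS = "Socrates's being a philosopher"

WORLDS = {
    # Socrates is a philosopher.
    "w1": {"exists": {SOCRATES, UNIVERSAL, SUM, STATE_OF_AFFAIRS},
           "true": {SENTENCE}},
    # Socrates exists but is not a philosopher; Plato instantiates the universal.
    "w2": {"exists": {SOCRATES, "Plato", UNIVERSAL, SUM,
                      "Plato's being a philosopher"},
           "true": set()},
}


def necessitates(entity, sentence, worlds=WORLDS):
    """Return True iff `sentence` is true in every world where `entity` exists."""
    return all(
        sentence in world["true"]
        for world in worlds.values()
        if entity in world["exists"]
    )


results = {
    candidate: necessitates(candidate, SENTENCE)
    for candidate in (SOCRATES, UNIVERSAL, SUM, STATE_OF_AFFAIRS)
}
```

Only the state of affairs necessitates the sentence in this model. Socrates, the universal, and their mereological sum all exist in the hypothetical world `w2`, where the sentence is false.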

Similarly, Armstrong argues that we also need totality states of affairs in order to find truthmakers for negative and general truths. All the first-order states of affairs that exist are not enough to guarantee that there are no unicorns, or that all spheres of gold are less than a mile in diameter. So Armstrong posits the existence of a totaling relation, and second-order states of affairs partially composed by it. Again we see truthmaking considerations driving an ontological argument for the existence of entities that we might not ordinarily posit.

Not all truthmaker theorists accept Armstrong’s pro-universals and pro-states of affairs approach to truthmaker theory. Others have defended nominalist positions that reject the existence of universals, and so maintain the thesis that reality is exhausted by the particular. One popular ‘moderate’ form of nominalism is the view that there are tropes, which are individual, particularized property instances. Whereas the realist maintains that there is one unified thing, the universal of being a philosopher that is commonly instantiated by both Plato and Aristotle, the trope nominalist argues that there are two different ‘being a philosopher’ tropes: the trope associated with Plato is a distinct existence from the trope associated with Aristotle. Tropes, at least if thought of as essentially tied to their bearers, can serve as truthmakers for contingent predications. If Socrates’s being a philosopher trope exists, it must be true that Socrates is a philosopher. That trope, whose identity is bound up with Socrates, cannot in any sense be ‘transferred’ to Aristotle or anyone else. So tropes are sufficient necessitators for contingent predications. For those who find tropes ontologically advantageous over universals and states of affairs, this is a compelling argument. (It remains to be seen, however, whether trope theorists can provide truthmakers for negative and general truths, and so whether they must also, in the end, posit the existence of states of affairs.)

Another nominalist-friendly approach to truthmakers comes from David Lewis (2003), who uses counterpart theory to resist the above arguments for states of affairs and tropes. On Lewis’s view, an object exists in only one possible world, but has counterparts in different possible worlds. But there are multiple ways of thinking about objects, and so multiple ways of identifying an object’s counterparts. For example, we can use the name ‘Socrates qua philosopher’ to identify a series of counterparts to Socrates, all of whom are philosophers. Similarly, ‘Socrates qua Greek’ identifies Socrates in a way such that all his counterparts are Greek. Lewis next maintains that objects under counterpart relations can be truthmakers for contingent predications: every possible world in which Socrates qua philosopher exists is a world in which Socrates (or his counterpart) is a philosopher. So Lewis provides necessitating truthmakers for contingent predications without admitting the existence of tropes or states of affairs.

The previous arguments presuppose that contingent predications and/or negative and general truths require truthmakers. If they do, then truthmaker theorists are led to positing the existence of objects such as universals, tropes, states of affairs, and counterparts. A competing perspective, however, derives from a refusal to assume maximalist truthmaking principles, and so avoids such arguments. This alternative approach does not assume from the beginning that contingent predications and/or negative and general truths require truthmakers, and so is not ready to concede that we need an ontology of counterparts, tropes, or states of affairs. Instead of defending the existence of such entities, these truthmaker theorists defend the truth of non-maximalist truthmaker principles (as discussed in section 3). For example, advocates of the strong supervenience principle—that any difference in truth between two possible worlds requires a difference in ontology between the two worlds—believe that negative and general truths do not require truthmakers, and so, for example, Armstrong’s argument for totality states of affairs is unsuccessful. Similarly, advocates of the weak supervenience principle—that any difference in truths between two possible worlds requires either a difference in ontology or a difference in what fundamental relations objects stand in—argue that contingent predications do not require truthmakers, and so the arguments above do not succeed in showing that such posits exist.

5. Truthmaking Principles

Some very general and controversial principles concerning truthmaker theory have been canvassed above, such as maximalism, strong and weak supervenience, and principles concerning whether the truthmaking relation is irreflexive (or merely non-reflexive), asymmetric (or merely non-symmetric), or anti-transitive (or merely non-transitive). Other disputed truthmaking principles concern how truthmakers relate to one another, and what other logical principles apply in the theory of truthmaking.

One such principle in truthmaker theory is the entailment principle: if X is a truthmaker for Y, then X is a truthmaker for anything entailed by Y. For example, suppose that the state of affairs of Socrates’s being a philosopher exists, and is a truthmaker for ‘Socrates is a philosopher’. Because ‘Socrates is a philosopher’ entails ‘Something is a philosopher’, the entailment principle holds that the state of affairs of Socrates’s being a philosopher is also a truthmaker for ‘Something is a philosopher’. Furthermore, any other state of affairs involving the universal being a philosopher will also be a truthmaker for ‘Something is a philosopher’, since the truthmaking relation is not one-one.

While seemingly quite plausible, the entailment principle runs into an immediate difficulty: the problem of trivial truthmakers for necessary truths. ‘Socrates is a philosopher’ also entails ‘2 + 2 = 4’, at least when entailment is thought of on the model of necessary truth preservation. Every world where ‘Socrates is a philosopher’ is true is a world where ‘2 + 2 = 4’ is true. But, presumably, the state of affairs of Socrates’s being a philosopher is not a truthmaker for ‘2 + 2 = 4’, though the entailment principle suggests otherwise. In response, truthmaker theorists find ways to restrict the entailment principle, or offer alternate understandings of the kind of entailment in question. Generally speaking, truthmaker theorists attempt to articulate a hyperintensional account of entailment that is more discriminating than standard modal entailment. For example, one might think that some sort of relevance notion of entailment is at stake (for example, Restall 1996); the hope is to develop a conception of entailment that maintains that while ‘Socrates is a philosopher’ entails ‘Something is a philosopher’, it does not entail ‘2 + 2 = 4’.

Another plausible truthmaking principle, and one entailed by the entailment principle, is the conjunction principle. According to this principle, any truthmaker for a conjunction is also a truthmaker for the individual conjuncts. The conjunction principle follows from the entailment principle simply because conjuncts are entailed by the conjunctions they compose. While plausible, the principle has been doubted (for example, Rodriguez-Pereyra 2006c). The principle might seem appealing so long as we think of the truthmaking relation as tracking entailment relations. But recall that the truthmaking relation is not just a necessitation or entailment relation. As an ‘in virtue of’ relation, there is more to being a truthmaker than just being a necessitator. Take, for example, the conjunctive truth ‘Socrates exists and Aristotle exists’. A plausible truthmaker for this conjunction is the mereological sum composed by Socrates and Aristotle. If that sum exists, the conjunction has to be true. But is that mereological sum a truthmaker for the individual conjuncts? Put another way: is ‘Socrates exists’ true in virtue of the existence of the mereological sum Socrates + Aristotle? One might say: no, ‘Socrates exists’ is true in virtue of the existence of Socrates, period. The mereological sum, while a genuine necessitator of the truth of ‘Socrates exists’, is not the entity responsible for the sentence’s truth. The truthmaker for the conjunction, in effect, has ‘extraneous’ parts that are irrelevant to the truth of some of its conjuncts. Since truthmaking is thought of as a hyperintensional relation such that mere necessitation is not sufficient for truthmaking, there is room to doubt that Socrates + Aristotle is a genuine truthmaker for ‘Socrates exists’. Other philosophers who defend the conjunction principle may simply accept the sum as an adequate, albeit non-‘minimal’ truthmaker for the conjunct. (That is, the truthmaker has a proper part that is also a truthmaker.) After all, a truth may have multiple truthmakers on the standard view.

A similar candidate truthmaking principle is the disjunction principle: any truthmaker for a disjunction is a truthmaker for at least one of the disjuncts. For example, if Socrates is a truthmaker for ‘Socrates exists or Cthulhu exists’, then he is a truthmaker either for ‘Socrates exists’ or ‘Cthulhu exists’. The principle seems innocuous enough, until one considers necessary disjunctions of the form ‘p or it is not the case that p’. If one accepts the basic entailment principle, then any object whatsoever is a truthmaker for every claim of the form ‘p or it is not the case that p’. By the disjunction principle, any object whatsoever is therefore a truthmaker of either ‘p’ or ‘it is not the case that p’, depending upon which one is the true disjunct. As a result, every object is a truthmaker for every truth. This unfortunate result has led many to rethink the plausibility of the entailment and disjunction principles. (This problem may well be circumvented if a ‘relevance’ style amendment to the entailment principle is offered.)
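The trivialization argument can be laid out step by step. The following is only a sketch: the notation $x \Vdash p$ for ‘x is a truthmaker for p’ and $\Rightarrow$ for entailment is an assumption introduced here, as are two premises left implicit in the informal argument, namely that the arbitrary object $x$ is a truthmaker for at least one truth (here, that $x$ exists) and that only truths have truthmakers, labelled (F).

```latex
% Requires amsmath and amssymb. Notation is assumed for illustration:
% x \Vdash p reads "x is a truthmaker for p"; \Rightarrow is entailment.
\begin{align*}
\text{(E)}\quad & (x \Vdash q) \wedge (q \Rightarrow r) \;\rightarrow\; x \Vdash r && \text{entailment principle} \\
\text{(D)}\quad & x \Vdash (q \vee r) \;\rightarrow\; (x \Vdash q) \vee (x \Vdash r) && \text{disjunction principle} \\
\text{(F)}\quad & x \Vdash q \;\rightarrow\; q && \text{assumed: only truths have truthmakers} \\[6pt]
1.\quad & x \Vdash \text{`$x$ exists'} && \text{assumed, for an arbitrary object $x$} \\
2.\quad & \text{`$x$ exists'} \Rightarrow (p \vee \neg p) && \text{a necessary truth is entailed by anything} \\
3.\quad & x \Vdash (p \vee \neg p) && \text{1, 2, (E)} \\
4.\quad & (x \Vdash p) \vee (x \Vdash \neg p) && \text{3, (D)} \\
5.\quad & \neg (x \Vdash \neg p) && \text{$p$ is true, so $\neg p$ is false; (F)} \\
6.\quad & x \Vdash p && \text{4, 5}
\end{align*}
```

Since $x$ and the truth $p$ were arbitrary, step 6 generalizes to every object and every truth. Laid out this way, the argument shows its possible exits: restrict (E) so that step 3 fails, or give up (D) so that step 4 fails.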

A similar, but less controversial truthmaking principle about disjunction would be that any object that is a truthmaker for some truth is also a truthmaker for any disjunction that includes that truth as a disjunct. So since Socrates is a truthmaker for ‘Socrates exists’, he is also a truthmaker for ‘Socrates exists or Caesar sank in the Rubicon’. This sort of principle has been at work since the beginning of truthmaker theory; Russell (1985) relied on it when arguing that we need not posit a realm of disjunctive facts to make disjunctive propositions true. Atomic facts on their own suffice to serve as truthmakers for disjunctions.

6. Truthmaking and Truth

This section is a halfway house in the transition away from the internal concerns of truthmaker theory, and toward its external connections with other domains of philosophy, for it is controversial whether or not the theory of truth is a distinct domain from the theory of truthmakers. This section explores the relationship between the theory of truth and the theory of truthmakers, and surveys the possible attitudes one might take about their relationship to one another.

The history of truthmaker theory is inextricably linked with the correspondence theory. The metaphysical ambitions of Russell’s logical atomism are a natural extension of the correspondence theory of truth that he was beginning to accept around the same time period. Nowadays truthmaker theory is sometimes thought of as a modified, contemporary update of correspondence theory. It is no great mystery why. According to correspondence theories of truth, a proposition is true if and only if it stands in the correspondence relation to some worldly entity. (Oftentimes these entities are thought to be facts.) According to truthmaker theory, it seems that propositions are true if and only if they have a truthmaker; that is, a proposition is true just in case it stands in the truthmaking relation to some worldly entity, its truthmaker. If one identifies the truthmaking relation with the correspondence relation, and the set of truthmakers (facts or not) with the set of corresponding objects, then it certainly appears that truthmaker theory provides a correspondence-style theory of truth.

Notice that the above perspective presupposes maximalism. The only possible way of finding a theory of truth (let alone a correspondence theory of truth) inside truthmaker theory is to first commit to the thesis that every truth has a truthmaker. Any truthmaker gap would be a counterexample to any attempt to explain the nature of truth by way of truthmakers. So the fact that maximalism is an optional commitment of truthmaker theory shows that taking truthmaker theory to be a theory of truth is also optional at best.

Even granting maximalism, anyone who seeks to define truth in terms of truthmakers still faces a crucial challenge. The truthmaking relation is itself typically understood in terms of truth. Truthmakers are objects that necessitate the truth of certain propositions, rather than any of those propositions’ other features. The accounts of the truthmaking relation canvassed in section 2 all presuppose the notion of truth. The essential dependence account, for example, holds that X is a truthmaker for Y only if Y is essentially such that it is true if X exists. Unless truthmaking can somehow be analyzed without further resort to truth, it cannot, on pain of circularity, be put to work in defining truth. Truth, it seems, is prior to truthmaking. Truthmaker theory presupposes the notion of truth, and so is not fit to serve as a theory of truth itself.

If truthmaker theory presupposes the notion of truth, does it presuppose any particular conception of truth? Again, many might think that truthmaker theory presupposes a correspondence theory of truth, or some similar substantive theory of truth. Several philosophers have also argued that truthmaker theory is incompatible with deflationary theories of truth (for example, Vision 2005). According to deflationary theories, truth is not a substantive property of propositions, in virtue of which they are true. The proposition that snow is white is not true in virtue of its having some property, or standing in a particular relation (for example, correspondence) to some object (or fact). Rather, the deflationist maintains, there is nothing more to the truth of the proposition that snow is white other than snow being white.

Accordingly, some might see deflationary theories of truth as containing an implicit rejection of truthmaker theory. On this view, truthmaker theory is incompatible with deflationary theories, and must presuppose some substantive theory of truth. (If not correspondence, there are coherence theories, pragmatic theories, epistemic theories, and others.) But it is not at all clear that anything in truthmaker theory conflicts with deflationary theories of truth. The latter tend to consist of axioms such as ‘The proposition that snow is white is true if and only if snow is white’ and ‘The proposition that Socrates is a philosopher is true if and only if Socrates is a philosopher’. These biconditionals themselves do not conflict with anything in truthmaker theory (or, typically, with any other theory of truth, either). Deflationists also maintain, in addition, that these axioms exhaust all there is to be said about the nature of truth. (It is this negative claim that substantive theories of truth must reject.) But truthmaker theorists need not be offering the claims of their theories as in any way revealing the nature of truth itself. To say that the truthmaker for the proposition that Socrates is a philosopher is a particular trope, state of affairs, or Socrates under a counterpart relation is not to say anything about the nature of truth itself. Rather, it is a claim about the particular ontological grounds needed for a particular claim about Socrates. In principle, truthmaker theorists and deflationists have nothing that they must disagree about.

7. Truthmaking and the Past

A longstanding metaphysical question concerns the reality of the past. Everyone can agree that entities in the present exist. But what about the objects that do not currently exist but someday will? And what about objects that used to exist but exist no longer? Presentism is the view that reality is exhausted by the present; the only things that exist are entities in the present. Eternalism, by contrast, is the view that there is no time limit on what exists: entities from the past are just as real as presently existing entities, which are just as real as future entities.

The existence of non-present entities is a highly contentious issue in philosophy. What is less controversial is the fact that there are, presently, truths about entities from the past. Presentists and eternalists disagree as to whether Socrates, a past entity, exists. But they agree that ‘Socrates existed’ is true. (What is more contentious is whether or not there are, right now, truths about the future. Parallel problems arise for those who think that there are truths about the future, but do not believe in the existence of purely future entities.) Eternalists face no difficulty in accounting for how such claims can be true. Socrates is the truthmaker for ‘Socrates existed’ in just the way that the Eiffel Tower is the truthmaker for ‘The Eiffel Tower exists’. Socrates and the Eiffel Tower are equally real, from the eternalist’s metaphysical point of view. One is located entirely in the past, and the other is located (but not entirely) in the present. But the present is not metaphysically privileged, so entities from the past and future are freely available to eternalists to serve as truthmakers.

Presentism, by contrast, faces a challenge from truthmaker theory. Given that there are truths about the past, but nothing (fully) from the past that exists, presentists are hard pressed to say what, if anything, can make those truths about the past true. Presentists have two available options. First, they can deny that truths about the past have truthmakers. Second, they can attempt to show that there are sufficient ontological resources in the present to ground the truths about the past.

Consider first the strategy of denying that truths about the past have truthmakers. This is a form of non-maximalism that limits truthmakers to truths about the present. Recall from section 3 that there are two distinct ways of conceiving of truthmaker gaps, that is, truths without truthmakers. There are deep truthmaker gaps, which are truths that do not depend in any way whatsoever upon what exists. Deep truthmaker gaps violate the principle that truth supervenes upon being: a deep truthmaker gap could be true in one world, but false in another, without there being any other difference between the two worlds. Shallow truthmaker gaps, by contrast, do not have truthmakers, but their truth is nonetheless ontologically accountable (by way, perhaps, of their adherence to one of the supervenience principles).

It appears that presentists cannot take advantage of the supervenience principles that have been defended by truthmaker theorists, and so appear to be forced into the view that if truths about the past are truthmaker gaps, they are deep truthmaker gaps. To see why, consider two presentist universes. These worlds are metaphysically indiscernible at the present moment: all the same things exist, and stand in the same fundamental relations. But they have different histories. In one of the universes, at some point some radioactive atom A decayed within its half-life, while a neighboring atom B did not. In the other universe, B decayed within its half-life, that is, within the predicted time it would take for half of a group of B-like atoms to radioactively decay, while A did not. So in the first universe, ‘A decayed within its half-life’ is true, while it is false in the second universe. But this difference has made no later difference in the histories of these universes, and so now, at present, the two universes are indiscernible. Yet something is true in one of them but not the other. So supervenience has been violated: they are discernible with respect to truth, but indiscernible with respect to being. Hence, presentists cannot defend a non-maximalist perspective on truths about the past without conceding that those truths are deep truthmaker gaps. But deep truthmaker gaps are highly unattractive: they make the truths in question brute, inexplicable truths. Given that eternalists have an easy, straightforward account of truthmakers for truths about the past, presentists face a serious objection. Presentists might respond by claiming that the supervenience principles need to be appropriately modified, such that truth supervenes on not just present being, but past being as well. But this response requires that present truths stand in relations to past entities, which is impossible for presentists who do not believe in past entities. If there are no past entities, there are no past entities for present truths to supervene upon.

The second strategy for presentism is to maintain that there are presently available truthmakers for truths about the past. On this kind of account, the burden is on the presentist to offer an ontological account of what present entities are available that can provide grounds for truths about the past. An eclectic menagerie of entities has been posited by presentists over the years to serve as truthmakers. Some have suggested that the world (the present world) has a variety of ‘tensed properties’ (for example, Bigelow 1996). For example, while echidnas make true ‘There are echidnas’, the world’s having the property of there having been dinosaurs makes true ‘There were dinosaurs’. Others have posited a realm of ‘tensed facts’ (for example, Tallant 2009). A tensed fact is a sui generis entity posited solely to provide a truthmaker for past truths. So the truthmaker for ‘There were dinosaurs’ is on this view just an entity of some sort that we call ‘the fact that there were dinosaurs’. Still others have suggested that, for example, God’s memory of there being dinosaurs is a truthmaker for ‘There were dinosaurs’ (for example, Rhoda 2009).

Anyone can posit an entity to be a truthmaker. Such posits constitute a genuine solution to the truthmaking challenge to presentism only if those entities are the right sorts of entities to be truthmakers, and only if they are entities whose existence is plausible and can be independently motivated (lest they remain ad hoc posits). After all, the eternalist stands ready with plausible, independently motivated truthmakers. Hence, presentists do not need to just offer some account of truthmakers for past truths; they need to provide an equally good account.

Tensed facts fail both sorts of challenges. Consider Socrates’s last moments, as the hemlock spread through his blood. During those moments, ‘Socrates exists’ was true, and made true by Socrates. A few moments later, ‘Socrates existed’ is true, and made true by a tensed fact that has just sprung into existence. That two truths so similar should be made true by such drastically different entities should be fairly disquieting. Socrates seems to be the perfect sort of thing to explain why ‘Socrates exists’ is true. After all, the sentence is about Socrates, a human being, and so a human being seems fit to provide the grounds for its truth. ‘Socrates existed’ is also about a human being, but now the supposed truthmaker is some sort of sui generis entity, something that is certainly composed in no way by a human being. There is no independent reason to believe in tensed facts; they are put forward as truthmakers for truths about the past by brute force, since it is unclear what they are apart from their stipulated role of being truthmakers for truths about the past.

Tensed property views face a similar sort of objection. ‘Socrates exists’ is true at some moment in virtue of Socrates. ‘Socrates existed’ is true the next moment, but in virtue of the world’s having some tensed property. Why, one might wonder, is ‘Socrates exists’ not true, when it is true, in virtue of the world’s having the tensed property of presently containing Socrates? If such properties are not motivated to account for the present, it is unclear why we should posit them to account for the past.

In general, any strategy using presently existing entities to make true truths about the past will face a common explanatory problem (Sanson and Caplan 2010). Why are truths about the past true in virtue of things in the present? After all, truths about the past seem to be about the past, and so it is unclear how anything not from the past could be an adequate explanation of why they are true. Truthmakers are not mere necessitators; they have to give the right sort of grounds for their truths. God’s memory of there being Socrates certainly necessitates the truth of ‘Socrates existed’. But it is fair to claim that ‘Socrates existed’ is not true in virtue of God’s having a particular memory. (To deny this seems to accept some form of divine idealism.) So God’s memories aren’t the right kind of thing to make true ‘Socrates existed’. (To the view’s credit, the existence of God’s memories can at least be motivated independently—for anyone motivated to believe in God. The view is obviously a non-starter for naturalistic metaphysics.)

8. Truthmaking and Modality

Another traditionally problematic domain of truths is that of modal claims: claims involving possibility and necessity, as well as related kinds of claims such as counterfactuals. For example, there are claims about mere possibilities, that is, possibilities that do not obtain, but could have. There are also necessary and impossible truths, and truths that those truths are necessary or impossible. Since such claims appear to concern a realm beyond the actual world, the grounds for their truth have long intrigued metaphysicians.

Though defended independently of his views about truthmaking, David Lewis’s modal realism can be put to work as a theory of truthmakers for some modal truths. According to Lewis, there exist, in addition to the actual world, infinitely many other concrete worlds. These other possible worlds are just as real as the actual world; the actual world bears no special metaphysical significance. While objects exist only in one possible world, they have counterparts in other worlds. An object’s counterparts are the entities in other possible worlds that are highly similar to the object (where similarity is explicated contextually). These counterparts can serve as truthmakers for modal truths concerning the actual world. For example, Socrates could have been a sophist. What makes that true, Lewis could maintain, is one of Socrates’s sophistic counterparts. Because there exists a counterpart of Socrates that is a sophist, ‘Socrates could have been a sophist’ is true in the actual world. At the same time, this view might face a relevance objection: the truth in question is a claim about Socrates, so how could it be made true by some individual existing in a separate, causally isolated possible world?

Armstrong hopes for a more austere account of the truthmakers for truths of mere possibility. To do this, he defends the principle that any truthmaker for a contingent truth is also a truthmaker for the truth that that truth is contingent. So, if some object X is a truthmaker for some contingent proposition that p, then X is a truthmaker for the truth that it is contingent that p. And if it is contingent that p, it follows that it is possible that it is not the case that p. X will therefore provide a truthmaker for the truth of mere possibility (assuming the truth of the right sort of entailment principle). For example, Socrates might not have been a philosopher, even though he was. Suppose the truthmaker for ‘Socrates is a philosopher’ is the state of affairs of Socrates’s being a philosopher. In that case, Socrates’s being a philosopher also makes it true that it is contingent that Socrates is a philosopher. By the entailment principle, Socrates’s being a philosopher is also a truthmaker for the claim that it is possible that Socrates is not a philosopher. In this way, Armstrong defends an account of truthmakers for truths of mere possibility that does not employ resources above and beyond the ordinary truthmakers needed to ground truths solely about the actual world.
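The structure of this argument can be set out schematically. This is a sketch only: the symbol $\Vdash$ for ‘is a truthmaker for’, the operator $\mathrm{Contingent}$, and its definition in terms of $\Diamond$ are notational assumptions added here for illustration, not notation drawn from Armstrong.

```latex
% Requires amsmath and amssymb. Assumed notation: X \Vdash p reads
% "X is a truthmaker for p"; \Rightarrow is entailment; \Diamond is possibility.
\begin{align*}
1.\quad & X \Vdash p && \text{given} \\
2.\quad & \mathrm{Contingent}(p) && \text{given} \\
3.\quad & X \Vdash \mathrm{Contingent}(p) && \text{1, 2, Armstrong's principle} \\
4.\quad & \mathrm{Contingent}(p) \Rightarrow \Diamond \neg p && \text{assuming } \mathrm{Contingent}(p) := \Diamond p \wedge \Diamond \neg p \\
5.\quad & X \Vdash \Diamond \neg p && \text{3, 4, entailment principle}
\end{align*}
```

With $p$ as ‘Socrates is a philosopher’ and $X$ as the state of affairs of Socrates’s being a philosopher, step 5 is the claim that this same state of affairs makes it true that Socrates might not have been a philosopher. The argument depends on whatever version of the entailment principle licenses step 5, so any restriction adopted to block trivial truthmakers for necessary truths must still permit this inference.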

As for necessary truths (and claims that such truths are necessary), most truthmaker theorists are agreed that not just any old entity will do, since mere necessitation is not sufficient for truthmaking. If it is true that God exists, and necessarily so, then presumably God is the truthmaker for such claims, not every object whatsoever. What is more contentious is what it is that makes mathematical statements true. Platonists might defend their view on the basis that numbers, understood Platonically, are necessary for giving an account of truthmakers for mathematical truths (for example, Baron 2013). Others might hope for a non-Platonic basis for mathematical truthmakers. Since it is agreed that truthmakers need to be ‘about’ or relevant to their corresponding truths, non-Platonists face the challenge of explaining how their purported truthmakers ground the truth of claims that at least appear to concern Platonic entities.

There are many more modal cases to keep truthmaker theorists busy. There are truths of natural necessity (for example, that all copper conducts electricity), conceptual truths (for example, that all bachelors are male), and logical truths (for example, that someone is human only if someone is human). All pose unique challenges for truthmaker theory.

9. Objections to Truthmaker Theory

Many philosophers are unmoved by truthmaker theory. A common thread running between the various objections that have been raised is that truthmaker theory lacks the motivation needed to justify its ontological posits. Truthmaker theory traditionally defends the existence of ontologically controversial entities (such as states of affairs or tropes), and such posits should figure into theories only when they have some indispensable theoretical role to play. And many are convinced that no such role exists.

One line of objection maintains that truthmaker principles that are weaker than maximalism are not worthy of the name, and that the ontological posits required for maximalism are unacceptable. So no form of truthmaker theory is tenable. (See, for example, Dodd 2002 and Merricks 2007.) Such objections rely on conceptions of truthmaker theory that are substantially narrower than what is actually found in the literature; non-maximalists will be unmoved by such supposed refutations. It is up to truthmaker theorists, not their opponents, to decide who counts as a truthmaker theorist.

Another common style of objection is to claim that the intuitions behind truthmaker theory can be saved far more economically by ontologically innocuous principles (for example, Hornsby 2005). As a result, the key but controversial principles supporting truthmaker theory (and the ontological results they produce) are unmotivated, and so should be rejected. The objection runs as follows. As above, a central motivating thought behind truthmaker theory is that truth depends on reality. Maximalists account for this intuition by way of requiring that every truth be made true by some entity, in virtue of which that truth is true. Non-maximalists might look to the strong or weak supervenience principles to explain how what is true is not independent from what exists and how those things are arranged. But other philosophers find these principles to be overreactions to the idea that truth depends on being. For these philosophers, that idea is best cashed out by pointing to the instances of the following schema:

The proposition that p is true because p.

For instance, the proposition that Socrates is a philosopher is true because Socrates is a philosopher. According to the objection, this ‘because principle’ suffices to explain how the truth of the proposition that Socrates is a philosopher depends upon reality. After all, this maneuver seems to capture the asymmetry between truth and reality. For instances of the reverse schema are false:

p because the proposition that p is true.

It is not the case that Socrates is a philosopher because the proposition that Socrates is a philosopher is true. Hence, there is no need to entertain the existence of a state of affairs or trope, and no need to posit general claims about the supervenience of truth on being.

The most natural response for truthmaker theorists to make is that the above ‘because principles’ remain silent on the questions of interest to truthmaker theorists. Advocates of the objection claim that such principles express the appropriate dependency between truth and reality. But there is no mention of reality anywhere in the principles. Consider what is being expressed by the ‘because principles’. They appear to apply a relation—the ‘because’ relation—between two sentences, or perhaps two propositions. The first sentence applies truth to a proposition; the second is just the use of a sentence that expresses that proposition. The ‘because principle’ cannot be expressing a relation involving entities such as facts or states of affairs, since the objector does not believe in the need for an ontology of those kinds of things. In fact, one can endorse a ‘because principle’ without taking any metaphysical or ontological stand about anything. The sentence ‘Socrates is a philosopher’ is completely silent on what exists. The sentence itself does not tell you what its ontological commitments are; one must bring to the sentence a theory of ontological commitment or truthmaking in order to determine what its metaphysical implications are. Presumably, advocates of the ‘because principles’ think that the used sentence following ‘because’ somehow involves reality. In so doing, they betray the fact that they are reading ontological implications already into the sentence. They are bringing, in other words, an implicit theory of truthmaking to the table.

Consider again the sorts of suspicious counterfactual conditionals that motivated truthmaker theory in the first place. The counterfactual ‘If I were to go to the quad I would have a tree-like sensory impression’ appears to be true, and true in virtue of the existence of a real, live tree in the quadrangle courtyard. That is the view that puts pressure on ontologies limited to actual sensory impressions: they have no available truthmakers for such counterfactuals, and so must take such claims to be primitive, brute truths. The objector to truthmaker theory points out that the proposition that if I were to go to the quad I would have a tree-like sensory impression is true because if I were to go to the quad I would have a tree-like sensory impression. That is true, but beside the point. It does not explain the need for something to exist in order for something to be true. We’re left wondering why I would have a tree-like sensory impression if I were to go to the quad. All the ‘because principle’ does (at least on the readings available to the objector) is cite a relation that obtains between two sentences or propositions; but truthmaker theorists are after a relation between truth and reality.

10. References and Further Reading

Armstrong, D. M. 2004. Truth and Truthmakers. Cambridge: Cambridge University Press.
- A systematic account of truthmaker theory by one of its most established proponents.
Baron, Sam. 2013. A truthmaker indispensability argument. Synthese 190: 2413-2427.
- Argues for mathematical Platonism on the basis of certain truthmaking considerations.
Beebee, Helen and Julian Dodd. 2005. Truthmakers: The Contemporary Debate. Oxford: Clarendon Press.
- An anthology of various essays both critical and supportive of truthmaker theory.
Bigelow, John. 1988. The Reality of Numbers: A Physicalist’s Philosophy of Mathematics. Oxford: Clarendon Press.
- Defends the strong supervenience principle, offering a non-maximalist approach to truthmaker theory.
Bigelow, John. 1996. Presentism and properties. Philosophical Perspectives 10: 35-52.
- Discusses the relationship between truthmaker theory and presentism; defends the view that truths about the past have truthmakers in the present.
Cameron, Ross P. 2008. Truthmakers and ontological commitment: or how to deal with complex objects and mathematical ontology without getting into trouble. Philosophical Studies 140: 1-18.
- Defends a view that requires truthmakers to be fundamental entities.
Caplan, Ben and David Sanson. 2011. Presentism and truthmaking. Philosophy Compass 6: 196-208.
- Provides an accessible introduction to presentism and truthmaker theory.
Dodd, Julian. 2002. Is truth supervenient on being? Proceedings of the Aristotelian Society (New Series) 102: 69-85.
- Argues that truthmaker theory is unmotivated.
Hornsby, Jennifer. 2005. Truth without truthmaking entities. In Truthmakers: The Contemporary Debate, eds. Helen Beebee and Julian Dodd, 33-47. Oxford: Clarendon Press.
- Argues that the intuitions behind truthmaking can be captured without resort to contentious ontological posits.
Lewis, David. 2001. Truthmaking and difference-making. Noûs 35: 602-615.
- Provides an important critical perspective on maximalist truthmaker theory that relies on defending the weak supervenience principle.
Lewis, David. 2003. Things qua truthmakers. In Real Metaphysics: Essays in Honour of D. H. Mellor, eds. Hallvard Lillehammer and Gonzalo Rodriguez-Pereyra, 25-42. London: Routledge.
- Provides a nominalist-friendly account of truthmaker theory that employs counterpart theory.
Lowe, E. J. 2009. An essentialist approach to truth-making. In Truth and Truth-Making, eds. E. J. Lowe and A. Rami, 201-216. Stocksfield: Acumen.
- Defends the view that the truthmaking relation is a kind of essential dependence.
Lowe, E. J. and A. Rami, eds. 2009. Truth and Truth-Making. Stocksfield: Acumen.
- An anthology of papers on truthmaker theory, including several on this list, that provides an introduction to core issues in truthmaker theory.
MacBride, Fraser. 2014. Truthmakers. In The Stanford Encyclopedia of Philosophy (Spring 2014 Edition), ed. Edward N. Zalta.
- Provides a detailed overview of several main theoretical concerns within truthmaker theory.
Mellor, D. H. 2003. Real metaphysics: replies. In Real Metaphysics: Essays in Honour of D. H. Mellor, eds. Hallvard Lillehammer and Gonzalo Rodriguez-Pereyra, 212-238. London: Routledge.
- Offers an argument that the truthmaking relation does not require necessitation.
Merricks, Trenton. 2007. Truth and Ontology. Oxford: Clarendon Press.
- Offers a sustained and ultimately negative critical evaluation of truthmaker theory.
Milne, Peter. 2005. Not every truth has a truthmaker. Analysis 65: 221-224.
- Raises a potential paradox for maximalism.
Molnar, George. 2000. Truthmakers for negative truths. Australasian Journal of Philosophy 78: 72-86.
- Introduces and discusses the problem of negative truths for truthmaker theory.
Mulligan, Kevin, Peter Simons and Barry Smith. 1984. Truth-makers. Philosophy and Phenomenological Research 44: 287-321.
- Offers a non-maximalist approach to truthmaker theory without resorting to states of affairs that begins by finding truthmakers for atomic facts.
Restall, Greg. 1996. Truthmakers, entailment and necessity. Australasian Journal of Philosophy 74: 331-340.
- Discusses problems (such as that related to the disjunction principle) with treating the truthmaking relation merely as a relation of necessitation.
Rhoda, Alan R. 2009. Presentism, truthmakers, and God. Pacific Philosophical Quarterly 90: 41-62.
- Posits the existence of God’s memories as providing presentist-friendly truthmakers for truths about the past.
Rodriguez-Pereyra, Gonzalo. 2006a. Truthmaker Maximalism defended. Analysis 66: 260-264.
- Defends truthmaker maximalism against Milne’s argument on the grounds that it begs the question.
Rodriguez-Pereyra, Gonzalo. 2006b. Truthmakers. Philosophy Compass 1: 186-200.
- Provides a highly accessible introduction to central issues in truthmaker theory.
Rodriguez-Pereyra, Gonzalo. 2006c. Truthmaking, entailment, and the conjunction thesis. Mind (New Series) 115: 957-982.
- Argues against certain core principles discussed in the truthmaking literature.
Russell, Bertrand. 1985. The Philosophy of Logical Atomism. ed. David Pears. La Salle, IL: Open Court.
- An early work that makes use of truthmaking ideas and that inspired later work on truthmakers.
Ryle, Gilbert. 1949. The Concept of Mind. Chicago: University of Chicago Press.
- Presents Ryle’s behaviorism, which later becomes a target of truthmaker theory.
Sanson, David and Ben Caplan. 2010. The way things were. Philosophy and Phenomenological Research 81: 24-39.
- Argues against various defenses of truthmakers for presentism on the ground that such posits are insufficiently explanatory.
Sorensen, Roy. 2001. Vagueness and Contradiction. Oxford: Clarendon Press.
- In the last chapter of this book Sorensen argues that the truthtelling sentence ‘This very sentence is true’ is a deep truthmaker gap: a truth without a truthmaker that depends in no way upon reality.
Tallant, Jonathan. 2009. Presentism and truth-making. Erkenntnis 71: 407-416.
- Discusses various strategies for presentist truthmaking.
Vision, Gerald. 2005. Deflationary truthmaking. European Journal of Philosophy 13: 364-380.
- Discusses the relationship between truthmaker theory and the deflationary theory of truth, and finds the two projects difficult to combine.

Author Information

Jamin Asay
Email: asay@hku.hk
University of Hong Kong
Hong Kong

Pejorative Language

Some words can hurt. Slurs, insults, and swears can be highly offensive and derogatory. Some theorists hold that the derogatory capacity of a pejorative word or phrase is best explained by the content it expresses. In opposition to content theories, deflationism denies that there is any specifically derogatory content expressed by pejoratives.

As noun phrases, ‘insult’ and ‘slur’ refer to symbolic vehicles designed by convention to derogate targeted individuals or groups. When used as verb phrases, ‘insult’ and ‘slur’ refer to actions performed by agents (Anderson and Lepore 2013b). Insulting or slurring someone does not require the use of language. Many different kinds of paralinguistic behavior could be used to insult (verb) or slur (verb) a targeted individual. Slamming a door in an interlocutor’s face is one way to insult them. Another way would be to sneer at them. Arguably, one could slur a Jewish person by performing a “Nazi salute” gesture in their presence. This article focuses on the linguistic meaning(s) that pejorative words encode as symbolic vehicles designed by convention to derogate (or harm) their targets.

Furthermore, it is important to delineate the differences between slurring and insulting. The latter is a matter of causing someone to be offended, where offense is a subjective psychological state (Hom 2012, p 397). Slurring, by contrast, does not require offending a target or eliciting any reaction whatsoever. For instance, the word ‘nigger’, used pejoratively at a Ku Klux Klan rally, derogates African Americans even if none are around to be offended by its use.

Desiderata
Content Theories
A Deflationary Theory
Broader Applications
References and Further Reading

1. Desiderata

This section focuses on five central features of pejoratives: practical features, descriptive features, embedded uses, expressive autonomy and appropriation. An explanation of these features is among the desiderata for an adequate theory of pejoratives.

a. Practical Features

There is a family of related practical features exhibited by pejoratives. First, pejoratives have the striking power to influence and motivate listeners. Insults and slurs can be used as tools for promoting destructive ways of thinking about their targets. Calling someone ‘loser’, for example, is a way of soliciting listeners to view them as undesirable, damaged, inferior, and so forth. Racial slurs have the function of propagating racism in a speech community. ‘Nigger’, for example, has the function of normalizing hateful attitudes and harmful discriminatory practices toward various “non-white” groups. Speakers have used the term to derogate African Americans, Black Africans, East Indians, Arabs, and Polynesians (among others). This is not to suggest that the derogation accomplished by means of pejoratives is always highly destructive. In some circumstances, insults like ‘asshole’ can be used to facilitate mild teasing between friends.

Second, some pejoratives tend to make listeners feel sullied. In some cases, merely overhearing a slur is sufficient for making a non-prejudiced listener feel complicit in a speaker’s slurring performance (Camp 2013; Croom 2011). Third, different pejoratives vary in their levels of intensity (Saka 2007, p 148). For instance, ‘nigger’ is much more derogatory toward Blacks than ‘honky’ is toward Whites. Even different slurs for a particular group can vary in their derogatory intensity (for example, ‘nigger’ is more derogatory than ‘negro’). Further, pejoratives exhibit derogatory variation across time. While ‘negro’ was once used as a neutral classifying term, it is now highly offensive (Hom 2010, p 166). A successful theory of pejoratives will need to account for their various practical features.

b. Descriptive Features

Gibbard (2003) suggests that the notion of a thick ethical concept, due to Williams (1985), can shed light on the meaning of slurs. In comparison with thin ethical concepts (such as right and wrong, just and unjust), thick ethical concepts contain both evaluative and descriptive content. Paradigm examples include cruel, cowardly and unchaste. For Williams, terms that express thick ethical concepts not only play a role in prescribing and motivating action; they also purport to describe how things are. To say that a person is cruel, for example, is to say that they bring about suffering, and they are morally wrong for doing so. (For more on the distinction between thick and thin moral terms, see Metaethics.) According to Gibbard,

[r]acial epithets may sometimes work this way: where the local population stems from different far parts of the world, classification by ancestry can be factual and descriptive, but, alas, the terms people use for this are often denigrating. Nonracists can recognize things people say as truths objectionably couched. (2003, p 300)

Although Gibbard’s claim that slurring statements express truths that are “objectionably couched” is controversial, it does seem that slurs classify their respective targets. A speaker who calls an Italian person ‘spic’ does not merely say something offensive and derogatory; the speaker simultaneously makes a factual error by classifying the target incorrectly. Similarly, the insult ‘moron’ appears to both ascribe a low level of intelligence to its targets and evaluate them negatively for it. Additionally, as the following example illustrates, some swear words seem to contain descriptive content:

(1) A: Tom fucked Jerry for the first time last week.

B: No, they fucked for the second time last week; the first was two months ago.

Also, consider the following example (Hom 2010, p 170):

(2) Random fucking is risky behavior.

There appears to be genuine disagreement between A and B in (1), and someone who asserts (2) has surely made a claim capable of being true or false. A successful theory of pejoratives should explain, or explain away, these apparent descriptive, truth-conditional features.

c. Embedded Uses

Potts (2007) observes that most pejoratives appear to exhibit nondisplaceability: the use of a pejorative is derogatory even as an embedded term in a larger construction. Indirect reports and conditional sentences are often vehicles of nondisplaceability; direct quotations, however, are excluded. Suppose, for example, that Sue has uttered (3) and another speaker attempts to report on her utterance with (4):

(3) That asshole Steve is on time today.

(4) Sue said that that asshole Steve is on time today.

As long as the occurrence of ‘asshole’ is not read as implicitly metalinguistic—with a change in intonation or an accompanying gesture indicating that the speaker wishes to distance herself from any negative feelings toward Steve—listeners will interpret the speaker of (4) as making a disparaging remark about Steve, even if the speaker is merely attempting to report on Sue’s utterance.

Like the insult ‘asshole’, the gendered slur ‘bitch’ appears to scope out of indirect reports. Suppose Eric utters (5) and someone tries to report on his utterance with (6):

(5) A bitch ran for President of the United States in 2008.

(6) Eric said that a bitch ran for President of the United States in 2008.

It would be difficult to use (6) to give a neutral (non-sexist) report of Eric’s offensive claim. Unless a metalinguistic reading is available for the occurrence of ‘bitch’, anyone who utters (6) in an attempt to report on Eric’s utterance of (5) risks making an offensive claim about women (Anderson and Lepore 2013a, p 29).

Potts claims that one way in which pejoratives are nondisplaceable is that they always tell us about the current utterance situation (2007, p 169-71). Consider

(7) That bastard Kresge was late for work yesterday (#But he’s no bastard today, because he was on time)

Despite the fact that ‘bastard’ is within the scope of a tense operator in (7), it would be implausible to read the speaker as claiming that she disliked Kresge only in the past, as the defective parenthetical (indicated by the hash sign) illustrates.

However, not all pejoratives behave the same way when embedded. Consider (8)-(11):

(8) If Steve doesn’t finish his report by the end of the week, he’s fucked (but I suspect he’ll finish on time.)

(9) Suppose our new employee, Steve, is a bastard (On the other hand, maybe he’ll be nice)

(10) Steve is not a bastard (I think he’s a good guy).

(11) Steve used to be a real fucker in law school (but I like him much better now).

A speaker who utters (8)-(11) need not be said to have made a disparaging claim about Steve. This is because the occurrences of ‘fucked’, ‘bastard’, and ‘fucker’ in (8)-(11) appear to be “narrow-scoping” (Hom 2012, p.387). Thus, at least some embedded uses of pejoratives seem not to commit the speaker to an offensive claim (compare the non-defective parentheticals in these cases with the defective one in (7)).

Slurring words, however, appear to behave differently. As (12) and (13) illustrate, slurs are just as offensive and derogatory when uttered as part of a supposition or embedded in a conditional sentence as when they are used in predicative assertions:

(12) If the guys standing at the end of my driveway are spics, I’ll tell them to leave (#Fortunately, there is no such thing as a spic, since no one is inferior for being Hispanic)

(13) Suppose the next job applicant is a nigger. (#Of course that won’t happen, since no one is inferior for being Black.)

Notice the defectiveness of the parentheticals as attempts to cancel the derogatoriness of the preceding sentences. In general, slurs appear to take wide scope relative to all truth-conditional operators, including negation. Consider the following explicit attempt to reject a racist claim:

(14) It is not the case that Ben is a kike; he is not Jewish!

(14) fails to undermine the derogatoriness of the slur ‘kike’. Seemingly, the trouble is that it only disavows a derogatory way of thinking about Ben, and so it cannot be used to reject a racist attitude toward Jews in general (Camp 2013). Further, as Saka (2007, p 122) observes, even Tarskian disquotational sentences containing slurs appear to express hostility:

(15) “Nietzsche was a kraut” is true iff Nietzsche was a kraut.

A successful theory of pejoratives must explain the behavior of embedded pejorative words and phrases, and more specifically, must account for the fact that slurring words and insulting words appear to behave differently within the scope of truth-conditional and intensional operators. A successful theory must also resolve the apparent tension between the putative descriptive features of slurs, and their behavior under embedding.

d. Expressive Autonomy

The expressive power of a pejorative term is autonomous to the extent that it is independent of the attitudes of particular speakers who use the term. Slurring words appear to exhibit derogatory autonomy: their derogatory capacity is independent of the attitudes of speakers who use them (Hom 2008, p 426). For instance, a racist who intends to express affection toward Italians by asserting, ‘I love wops; they are my favorite people on Earth’, has still used the slur in a patently offensive manner (Anderson and Lepore 2013a, p 33). Likewise, a competent speaker who knows that ‘kike’ is a term of abuse for Jews could not stipulate a non-derogatory meaning by uttering, “What’s wrong with saying that kikes are smart? By ‘kike’, I just mean Jews, and Jews are smart, aren’t they?” (Saka 2007, p 148).

e. Appropriation

Some pejoratives are used systematically to accomplish aims other than those for which they were designed. Appropriation refers to the various systematic ways in which agents repurpose pejorative language. For certain slurs, the target group takes over the term to transform its meaning to lessen or to eliminate its derogatory force. This is one variety of appropriation known as linguistic reclamation (Brontsema 2004). The term ‘queer’ is a paradigm case. Although ‘queer’ once derogated those who engaged in sexually abnormal behavior, the term ‘queer’ now contains little to no derogatory force as a result of homosexual women and men appropriating the term. Now, non-prejudiced speakers can use the term ‘queer’ in various contexts. For instance, phrases such as ‘queer studies program’ and ‘queer theory’ do not derogate homosexuals. In contrast, the slur ‘nigger’, often marked by the alternative spelling ‘nigga’, has been appropriated more exclusively by the target group, and is often used as a means of expressing camaraderie between group members (Saka 2007, p 145). Barring a few rare exceptions, targeted speakers can use the term to refer to one another in a non-denigrating way. Appropriated uses of ‘nigger’ are common in comedic performances and satire. The use of ‘nigger’ in a comedy bit designed to mock and criticize racism need not commit the speaker to racist attitudes (Richard 2008, p. 12).

Insults are also subject to appropriation. In some contexts, an insult can be used to express something more jocular or affectionate than hateful, such as in the phrase: ‘George is the most lovable bastard I know’. A successful theory of pejoratives needs to account for their various appropriated uses.

2. Content Theories

According to content theories, pejorative words are derogatory in virtue of the content they express. This section contains an overview and discussion of several content theories and their merits, followed by standard criticisms.

a. Pejorative Content as Fregean Coloring

For Gottlob Frege, two aspects to the meaning of a term are its reference and its sense. The reference is what the term denotes, while the sense provides instructions for picking out the reference. (For more on this distinction see Gottlob Frege: Language.) Additionally, Frege posited an expressive realm of meaning separate from sense and reference. For Frege, a word’s färbung (often translated as ‘coloring’ or ‘shading’) is constituted by the negative or positive psychological states associated with it that play no role in determining the truth-value of utterances that include it. The terms ‘dog’ and ‘cur’, for example, share the same sense and reference, but the latter has a negative coloring – something like disgust or contempt for the targeted canine (Frege 1892, p.240). Likewise for the neutral term ‘English’ and the slur ‘Limey’, which was once applied exclusively to English sailors, but now targets English people generally:

(16) Mary is English.

(17) Mary is a Limey.

For Frege, both (16) and (17) are true just in case Mary is English. However, for most speakers, ‘English’ is neutral in coloring, while ‘Limey’ is associated with negative feelings for English people.

Although the Fregean approach accounts for the descriptive features of pejoratives as well as the behavior of slurs when embedded, most contemporary theorists reject it (see, for example, Hom (2008)). For Frege, “coloring and shading are not objective, and must be evoked by each hearer or reader” (1897, p 155). On his view, a pejorative term’s coloring is not conventional (in any sense of the term); rather, coloring consists only in subjective (non-conventional) associations speakers have with the term. Dummett (1981) diagnoses the problem with positing an essentially subjective realm of meaning: the meaning of a linguistic sign or symbol cannot be in principle subjective, since it is what speakers convey to listeners. Given the subjective nature of coloring, Fregeans are committed to holding that the derogatory power of slurs is due to subjective associations held by speakers and listeners. As a result, Fregeans will have difficulty accounting for expressive autonomy (Hom 2008, p 421). For instance, Fregeans will have trouble explaining why ‘nigger’ can be just as derogatory in the mouth of a racist as it is when uttered by a non-racist. In reply, Fregeans might offer a dispositional theory of coloring. Consider an analogy with a dispositional theory of color, according to which a thing is yellow, for example, if it disposes normal agents in appropriate conditions to have a qualitative experience of yellow. Similarly, Fregeans might hold that a slur S has a negative coloring to the extent that uttering or hearing S disposes speakers and listeners to have derogatory attitudes toward the target. This approach could generalize to other pejorative terms. Consider Frege’s example of ‘cur’. On the revised version of the theory, ‘cur’ has a negative coloring to the extent that competent listeners who hear the term predicated of a dog are disposed to think of the targeted canine as flea-ridden, mangy, and dangerous.
Such an account might be promising, but much more would need to be said about how hearing the word disposes listeners to think in derogatory ways. As it stands, the Fregean view does little to explain how pejoratives can be so rhetorically powerful.

b. Expressivism

Another main theory of pejoratives is a descendant of the metaethical view known as expressivism. According to the version of expressivism developed by Ayer (1936), moral and aesthetic statements do not express propositions capable of being true or false, and merely serve to express and endorse the speaker’s own moral sentiments. For Ayer, an assertion of ‘Stealing is wrong’ does not express a truth-evaluable proposition; rather, it merely expresses the speaker’s disapproval of stealing. (For more on expressivism, see Non-Cognitivism in Ethics.)

One might extend Ayer’s expressivism to cover pejoratives. On this view, derogatory statements containing pejoratives do not express propositions capable of being true or false – they merely express a non-cognitive attitude, such as disapproval, of the target group. An expressivist theory of pejoratives is well suited to explain the behavior of slurs under embedding. However, it will have difficulty accounting for their descriptive features. As noted above, a speaker who calls an Italian person ‘spic’ has seemingly made a classificatory error. If slurs lack descriptive content, and merely serve to express non-cognitive attitudes, then it is unclear how they could classify their targets.

Saka (2007) offers an alternative, hybrid expressivist theory of slurs, according to which slurs contain both expressive and descriptive content (see also Kaplan (2004)). Saka denies that there is a single belief or proposition expressed by slurring statements such as ‘Nietzsche was a kraut’. Rather, such statements express an attitude complex, which includes (i) the pure belief that Nietzsche was German, and (ii) a cognitive-affective state toward Germans (Saka 2007, p 143). Saka’s hybrid theory could plausibly account for the descriptive, truth-conditional features of pejoratives.

However, it is not clear that either the pure expressivist theory of pejoratives or Saka’s hybrid theory can extend to all pejoratives. According to a standard objection to metaethical expressivism, the so-called Frege-Geach problem, one can utter a sentence containing a moral predicate (such as ‘good’, ‘evil’, ‘right’, and ‘wrong’) as the antecedent or consequent of a conditional sentence without making a moral judgment. Expressivists about moral terms are unable to account for the sameness of content in both asserted and non-asserted contexts, so the objection goes. For example, as Geach (1965) observed, the following is a valid argument:

(18) If tormenting the cat is wrong, then getting your brother to do it is wrong.

(19) Tormenting the cat is wrong.

(20) Therefore, getting your brother to do it is wrong.

If, as the metaethical expressivist claims, ‘wrong’ merely expresses a speaker’s disapproval, then it is a mystery how the term ‘wrong’ could carry the same content in (19) and when embedded in the antecedent of the conditional sentence in (18), given that (19) expresses a moral judgment while (18) does not. Hom (2010) argues that expressivist theories of swears face a similar challenge. Consider the following argument:

(21) If George fucked up his presentation, he will be fired.

(22) George fucked up his presentation.

(23) Therefore, he will be fired.

In order for this argument to be valid, the pejorative term ‘fucked’ must have the same semantic content in (21) and (22), despite the fact that (21) does not express a negative attitude about George, while (22) does. It is difficult to see how the pure expressivist theory could account for this. Although Saka’s hybrid theory has the potential to explain the preservation of content between (21) and (22), his view will have difficulty accounting for the fact that (21) expresses no negative attitude about George.

Additionally, one might worry that the non-cognitive attitudes posited by expressivism are too underspecified to account for derogatory variation (‘kraut’ is less derogatory than ‘nigger’ is, and so forth). Do all pejoratives express something like ‘contempt’ or ‘hostility’ or do the negative attitudes differ for each term? Saka claims that derogatory variation among slurs is due to the historical circumstances that led to their introduction and sustain their derogatory power (Saka 2007, p148). But the appeal to historical context here is illicit if the derogatoriness of slurs is to be explained by an attitude complex expressed by speakers who use the term. After appealing to external institutions to explain the derogatory features of slurs, it appears that the posited attitude complex has no remaining explanatory work.

Finally, expressivists need to do more to explain how the expression of negative attitudes relates to the practical features of slurs. In particular, they need to specify a notion of expression that makes it clear how the expression of hostility (or contempt, and so forth) toward a target could motivate listeners to feel similarly.

c. Slurs and Truth-Value Gaps

Richard (2008) holds that slurs express derogatory attitudes toward their targets, but unlike Saka he claims that slurs lack truth-conditional content. Richard is not a pure expressivist, since he does not take the derogatory content of slurs to be a negative affective state. He denies that slurring speech is false by claiming that to apply the term ‘false’ to an utterance is to claim that the speaker made an error that can be corrected by judicious use of negation. Yet examples like “My neighbor isn’t a chink; she’s Japanese,” suggest that the error made in slurring cannot be corrected in this way. Richard also denies that derogatory statements containing slurs can be true. He acknowledges that predicating a slur of someone entails classifying him or her as a member of a particular group, but he denies that correct classification suffices for truth. For instance, Richard holds that the anti-Semite can correctly classify a person as Jewish by calling them ‘kike’, but when a speaker slurs a Jewish person with ‘kike’, they have not simply classified them as Jewish nor have they merely expressed an affective state (like hatred or contempt); they have misrepresented the target as being despicable for being Jewish. According to Richard, we cannot endorse the classification as true without also endorsing the representation as accurate. On his view, whatever truth belongs to a classification is truth it inherited from the thought expressed in making it, and the thought expressed by the anti-Semite who uses the slur ‘kike’ is the mistaken thought that Jews are despicable for being Jewish (Richard 2008, p. 24).

Although Richard’s view could potentially make sense of the behavior of slurs under embedding, he does not offer a positive theory of how slurs represent their targets. He might hold that the relevant sort of representation is imagistic. Perhaps hearing a slur puts an unflattering image of the target group in the minds of listeners. In any event, Richard offers no help here. Instead, he is interested only in establishing that there are numerous statements – among them, derogatory statements containing slurs – that have a determinate content, yet are not truth-apt. Others include applications of vague predicates to borderline cases and statements that give rise to liar paradoxes. As it stands, nothing in Richard’s view helps us see how misrepresenting a target by means of calling them a pejorative word has the power to motivate listeners to think derogatory thoughts about them. Thus, Richard’s view leaves the practical features of slurs unexplained.

Further, there are reasons to be doubtful of Richard’s claim that slurs always misrepresent their targets. While this claim seems plausible in the case of racial slurs, it is not obviously true of all slurring words. Consider ‘fascist’, which is a slur for officials in an authoritarian political system. On Richard’s view, to call Mussolini and Hitler fascists is to represent them as contemptible for their political affiliation. Presumably, this would not be to misrepresent them. Richard might agree, and respond that the concept of truth is not what we should use when evaluating a slurring performance as accurate or inaccurate. In that case, Richard still owes a positive account of how such words can accurately represent their targets. Absent these details, it is difficult to evaluate Richard’s claims.

d. A Gestural Theory

Hornsby (2001) offers a theory of the derogatory content of slurs, but her view could be extended to cover other pejoratives:

It is as if someone who used, say, the word ‘nigger’ had made a particular gesture while uttering the word’s neutral counterpart. An aspect of the word’s meaning is to be thought of as if it were communicated by means of this (posited) gesture. The gesture is made, ineludibly, in the course of speaking, and is thus to be explicated…in illocutionary terms. (p 140)

According to Hornsby, the gestural content of a slur cannot be captured in terms of a proposition or thought. Rather, “the commitments incurred by someone who makes the gesture are commitments to targeted emotional attitudes” (2001, p 140). Hornsby’s gestural theory has the potential to account for slurs’ expressive autonomy and their offensiveness when embedded. Unfortunately, Hornsby’s central thesis is unclear. On one interpretation, she holds that a speaker who uses a slur actually performs a pejorative gesture in the course of uttering it, although the gesture itself is elliptical. On another interpretation, she is claiming only that using a slur is analogous to performing a derogatory gesture. On either interpretation, there is a lacuna in Hornsby’s theory. If the first interpretation is what Hornsby intends, she owes an account of what the posited gestures are supposed to be. Perhaps she thinks that to call an African American ‘nigger’ is to perform an elliptical “throat slash” in their direction (Hom 2008, p 418). Or maybe uttering ‘nigger’ amounts to giving targets “the finger”. If this is what Hornsby intends, she owes an account of how it is possible to perform an elided gesture. On the other hand, if Hornsby is merely claiming that derogatory uses of slurs are analogous to pejorative gestures, she needs to specify how tight the analogy is.

e. A Perspectival Theory

Camp (2013) offers a perspectival theory of slurs. On her view, slurs are so rhetorically powerful because they signal allegiance to a perspective, which is an integrated, intuitive way of thinking about the target group (p335). For Camp, a speaker who slurs some group G non-defeasibly signals his affiliation with a way of thinking and feeling about Gs as a whole (p340). The perspectival account offers an explanation for why slurs produce a feeling of complicity in their hearers, that is, why non-racist listeners tend to feel implicated in a speaker’s slurring performance. Camp describes two kinds of complicity. First, there is cognitive complicity:

The nature of semantic understanding, along with the fact that perspectives are intuitive cognitive structures only partially under conscious control, means that simply hearing a slur activates an associated perspective in the mind of a linguistically and culturally competent hearer. This in turn affects the hearer’s own intuitive patterns of thought: she now thinks about G’s in general, about the specific G (if any) being discussed, and indeed about anyone affiliated with Gs in the slurs’ light, however little she wants to. (p343)

Second, there is social complicity: the fact that there exists a word designed by convention to express the speaker’s perspective indicates that the perspective is widespread in the hearer’s culture, and being reminded of this may be painful for non-prejudiced listeners (Camp 2013, p 344; see also Saka 2007, p 142). Camp’s theory also has the potential to explain linguistic reclamation. When a slur is taken over by its target group and its pejorative meaning is transformed, the derogatory perspective it once signaled becomes detached and the term comes to signal allegiance to a neutral (or positive) perspective on the target.

One might take issue with Camp’s claim that complicity is due to speakers signaling the presence of racist attitudes. In general, merely signaling one’s own perspective is insufficient for generating complicity. For instance, one might signal one’s libertarian political perspective by placing a ‘Ron Paul’ bumper sticker on one’s car, yet this behavior is not likely to make observers feel complicit in the expression of a libertarian perspective. Even signaling one’s racist attitudes need not lead others to feel complicit. For instance, one might overtly signal a racist perspective by refusing to sit next to members of a certain race on a bus or by crossing the street whenever a member of a certain race is walking toward them; however, in most cases, this sort of behavior is not likely to activate a derogatory perspective in observers or make observers feel complicit. Thus, the fact that slurs signal a derogatory perspective, if it is a fact, does not explain why slurs tend to make listeners feel complicit in the expression of a derogatory attitude.

f. Implicature Theories

In some cases, what a speaker means is not exhausted by what she literally says. Grice (1989) distinguishes what a speaker literally says with her words from what she implies or suggests with them. Grice posited two kinds of implicature: conversational and conventional. When a speaker communicates something by means of conversational implicature, she violates (or makes as if to violate) a conversational norm, such as ‘provide as much information as is required given the aim of the conversation’. The hearer, working on the assumption that the speaker is being cooperative, attempts to derive the implicatum (that is, what the speaker meant, but did not literally say) based on the words used by the speaker and what conversational norm she has (apparently) violated. Suppose that Professor A has written a letter of recommendation for her philosophy student, X, that reads, “Mr. X’s command of English is excellent, and his attendance at tutorials has been regular” (Grice 1989, p 33). The reader, recognizing that A does not wish to opt out, will observe that she has apparently violated the maxim of quantity: seemingly, she has not provided enough information about X’s philosophical abilities for the reader to make an assessment. The most reasonable explanation for A’s behavior is that she thinks X is a rather bad student, but is reluctant to explicitly say so, since doing so would entail saying something impolite or violating some other norm.

According to Grice, sometimes the conventional meaning of a term determines what is implied by usage of the word, in addition to determining what is said by it. If a sentence s conventionally implies that Q, then it is possible to find another sentence s*, which is truth-conditionally equivalent to s, yet does not imply that Q. Consider the sentences ‘Alexis is rich and kind’ and ‘Alexis is rich, but kind’. For Grice, these two sentences have the same literal truth-conditions (they are true just in case Alexis is both rich and kind), but only the latter implies that there is a contrast between being rich and kind (in virtue of the conventional meaning of ‘but’). (For more on Grice’s theory of implicature, see Philosophy of Language.)

One might apply Grice’s theory of implicature to pejoratives. A theory that understands pejorative content as conversationally implicated content has little chance of succeeding. First, it seems that the pejorative meaning of a slur needn’t be worked out by the listener in the way that a conversational implicature must be (Saka 2007, p136). Second, conversational implicata are supposed to be cancellable, but the derogatory content of a slur is not (Hom 2008, p434). According to Grice, for any putative conversational implicature P, it will always be possible to explicitly cancel P by adding something like ‘but not P’ or ‘I do not mean to imply that P’. And it is clear that the derogatory contents of slurs are not explicitly cancellable, as the following defective example illustrates: ‘That house is full of kikes, but I don’t mean to disparage Jewish people’.

Stenner (1981), Whiting (2007, 2013) and Williamson (2009) have argued that the derogatory content of some pejorative words and phrases is best understood in terms of conventional implicature. According to a conventional implicature account of slurs (hereafter, the ‘CI account’), slurs and their neutral counterparts have the same literal meaning, but slurs conventionally imply something negative that their neutral counterparts do not. For instance, ‘Franz was German’ and ‘Franz was a Boche’ are the same at the level of what is said – they have the same literal truth-conditions, that is, they are both true just in case Franz was German. But ‘Franz was a Boche’ conventionally implies the false and derogatory proposition that Franz was cruel and despicable because he was German. One virtue of the CI account is that it explains the descriptive features of pejoratives as well as expressive autonomy.

One objection to the CI account is that it is controversial whether there is any such thing as conventional implicature. Bach (1999) argues that putative cases of conventional implicature are actually part of what is said by an utterance. Bach devised the indirect quotation (IQ) test for conventionally implicated content. Suppose that speaker A has uttered (24), and speaker B has reported on A’s utterance with (25):

(24) She is wise, but short.

(25) A said that she is wise and short.

According to Bach, since B has left out important information in her indirect report, namely information about the purported contrast between being wise and short, that information must have been part of what was said, as opposed to what was implied, by A’s utterance. Hom (2008) uses Bach’s IQ test to undermine the CI account of slurs. Suppose A uttered (26) and B reported on A’s utterance with (27):

(26) Bill is a spic.

(27) A said that Bill is Hispanic.

According to Hom, since B has misreported A, the derogatory content of the slur must be part of what is said, and so the CI account fails. Notice, however, that Hom’s use of Bach’s test does not show that the derogatoriness of slurs must be part of their literal semantic content, since “what is said” could refer to pragmatically enriched content (see, for example, Bach (1994)). A more serious objection is that even if Griceans are correct in holding that an utterance of ‘Italians are wops’ carries a negative implicature about Italians, more would need to be said in order to explain how implying something negative about Italians could bring about complicity in listeners, and motivate listeners to discriminate against Italians. Consider a paradigm case of conventional implicature: a speaker who asserts ‘P but Q’ commits herself to a contrast between P and Q by virtue of the conventional meaning of ‘but’. However, there is no reason to think that bystanders would automatically feel complicit in the speaker’s claim. Yet listeners often find themselves feeling complicit in the expression of a negative attitude just by overhearing a slur. Even if terms like ‘but’ are capable of triggering a kind of complicity, it is surely not the robust sort of complicity triggered by slurs. Potts (2007) offers a non-propositional version of the CI account. Potts understands pejorative content in terms of expressive indices, which model a speaker’s negative (or positive) attitudes in a conversational context. He offers the following schema for an expressive index:

<a I b>

where a and b are individuals, and I is an interval that represents a’s positive or negative feelings for b in the conversational context. The narrower the interval, the more intense the feeling. If I = [-1, 1], then a is essentially indifferent toward b. If I = [0.8, 1], then a has a highly positive attitude toward b. If I = [-0.5, 0], then a has negative feelings for b. For Potts, the conventionally implicated content of a pejorative is a function that alters the expressive index of a conversational context. So, for example, if Bill calls George a ‘spic’, the expressive index might shift from <Bill [-1, 1] George>, where Bill is indifferent to George, to <Bill [-0.5, 0] George>, where Bill has negative feelings toward George. Potts’s theory could potentially account for complicity. He might argue that a feeling of complicity results from taking part in a conversation whose expressive index has been lowered due to the use of a slur. One problem with Potts’s theory is that expressive indices are supposed to measure psychological states of conversation participants, and these can depend on a variety of idiosyncratic features of the participants – their background beliefs, values, and so forth. This makes it difficult to see how the expressive content of pejoratives could be objective and speaker-independent (Hom 2010, p180).

Additionally, Potts’s numerical modeling of attitudes seems too coarse-grained to explain the differences between slurs and other pejoratives. One could shift the expressive index of a conversation by using an insult like ‘asshole’ or even by using non-pejorative language. For instance, Bill might lower the expressive index in a conversation about his colleague, George, by pointing out that George is late for work and that he’s not dressed appropriately for the office. Bill could also lower the index by uttering, ‘Here comes George!’ in a contemptuous tone of voice. If Potts is correct, the pejorative content of slurs like ‘nigger’, ‘chink’, and ‘spic’ should be understood in terms of expressive indices. However, in that case, Potts will have difficulty explaining the distinctively racist nature of these words, which derogate individuals qua members of particular racial groups.
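Potts’s index-shifting mechanism, as described above, can be made concrete with a minimal sketch. The class and function names here (ExpressiveIndex, valence, apply_pejorative) are hypothetical, the valence labels are chosen only to match the three example intervals given above, and the requirement that an update stay inside the previous interval is an assumption of the sketch rather than something stated in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressiveIndex:
    """An index <a I b>: a's feelings toward b as an interval I = [low, high]."""
    a: str
    low: float
    high: float
    b: str

    def __post_init__(self) -> None:
        # Intervals are assumed to lie within [-1, 1], as in the examples above.
        if not -1.0 <= self.low <= self.high <= 1.0:
            raise ValueError("expected -1 <= low <= high <= 1")

    @property
    def width(self) -> float:
        # A narrower interval represents a more intense feeling.
        return self.high - self.low

    def valence(self) -> str:
        # Hypothetical labels, chosen only to fit the three example intervals.
        if (self.low, self.high) == (-1.0, 1.0):
            return "indifferent"
        if self.high <= 0.0:
            return "negative"
        if self.low >= 0.0:
            return "positive"
        return "mixed"


def apply_pejorative(index: ExpressiveIndex, low: float, high: float) -> ExpressiveIndex:
    """Model a pejorative as a function from one expressive index to another."""
    # Assumption of this sketch: the new interval must sit inside the old one.
    if not (index.low <= low and high <= index.high):
        raise ValueError("new interval must be contained in the old one")
    return ExpressiveIndex(index.a, low, high, index.b)


# The shift described above: from <Bill [-1, 1] George> to <Bill [-0.5, 0] George>.
before = ExpressiveIndex("Bill", -1.0, 1.0, "George")
after = apply_pejorative(before, -0.5, 0.0)
print(before.valence(), "->", after.valence())
```

Nothing in apply_pejorative depends on which expression triggered the update. An insult, a tone of voice, or a complaint about lateness could produce the same shift, which is the coarse-grainedness worry raised above.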

g. A Presupposition Theory

In the philosophical literature, to presuppose a proposition P is to take P for granted in a way that contrasts with asserting that P (Soames 1989, p.553). According to one widely accepted theory, presupposed content is best understood in terms of attitudes and background beliefs of speakers. According to Robert Stalnaker’s theory of pragmatic presupposition, each conversation is governed by a conversational record, which includes the common ground, that is, the background assumptions mutually accepted by participants for the purposes of the conversation. The pragmatic presuppositions of an utterance are the requirements it places on sets of common background assumptions built up among conversational participants (Soames 1989, p.556). Mutually accepted background assumptions are subject to change over the course of a conversation. Lewis (1979) observes that information can be added to (or removed from) the conversational record when necessary in order to forestall presupposition failure and make what is said conversationally acceptable. For instance, if a speaker says, ‘Avery broke the copy machine’ in the course of a conversation, and it was not already mutually understood by the speaker and her listeners that a copy machine was damaged, then it will be assumed for the purposes of the conversation that some salient copy machine was broken. Schlenker (2007) argues that pejorative content is best understood in terms of presupposition. Consider how the presupposition theory covers slurs. Suppose (28) is asked in a conversation:

(28) Was there a honky on the subway today?

According to Schlenker, if none of the conversation participants dissent, a derogatory proposition (or set of such propositions) – for example, that Caucasians are despicable for being Caucasian, that the speaker and the audience are willing to treat Caucasians as despicable – is incorporated into common ground.

There are several problems with the presupposition theory of pejoratives. First, as Potts (2007), Hom (2010), and Anderson and Lepore (2013a) observe, presuppositions can be cancelled when sentences that trigger them are embedded in an indirect report, but the derogatoriness of embedded slurs cannot be cancelled. Compare (29) with (30):

(29) Frank believes that John stopped smoking, but John has never smoked.

(30) #Eric said that a nigger is in the white house, but Blacks are not inferior for being Black.

Ordinarily, an assertion of ‘John stopped smoking’ presupposes that John previously smoked. When embedded in an indirect report, however, the presupposition can be cancelled, as (29) illustrates. In contrast, (30) appears to convey something highly offensive, which cannot be cancelled by the right conjunct. If the presupposition account were correct, we would expect (30) to be inoffensive and non-derogatory. Also, as Richard (2008) has observed, derogation with slurs needn’t be a rational, cooperative effort between speakers. According to Richard,

[a] pretty good rule of thumb is that someone who is using these words is insulting and being hostile to their targets. But there is a rather large gap between doing that and putting something on the conversational record. If I yell ‘Smuck!’ at someone who cuts me off…[a]m I entitled to assume, if you don’t say ‘He’s not a smuck’, that you assume that the person in question is a smuck, or are hostile towards him? Surely not. (2008, pp.21-2)

h. Inferentialism

Inferentialism is the thesis that knowing the meaning of a statement is a matter of knowing both the conditions under which one is justified in making the statement and the consequences of accepting it, which include the inferential powers of the statement and anything that counts as acting on the truth of the statement (Dummett 1981, p 453). On this view, one knows the meaning of the term ‘valid’, for example, if one knows the criteria for applying ‘valid’ to arguments and understands the consequences of such an application, namely that an argument’s validity provides a basis for accepting its conclusion so long as one accepts its premises.

Dummett (1981) offers an inferentialist account of slurs (see also Tirrell (1999) and Brandom (2000)). Dummett posits two inference rules for slurs: an introduction rule and an elimination rule. The introduction rule gives sufficient conditions for applying the slur to someone and the elimination rule specifies what one commits oneself to by doing so. Consider the slur ‘boche’, which was once commonly applied to people of German origin:

The condition for applying the term to someone is that he is of German nationality; the consequences of its application are that he is barbarous and more prone to cruelty than other Europeans. We should envisage the connections in both directions as sufficiently tight as to be involved in the very meaning of the word: neither could be severed without altering its meaning (454).

Williamson (2009) formalizes Dummett’s inference rules for ‘boche’ as follows:

Boche introduction:

x is a German

Therefore, x is a boche

Boche elimination:

x is a boche

Therefore, x is cruel
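Taken together, the two rules license a direct inference from ‘x is a German’ to ‘x is cruel’. A minimal Lean 4 sketch of that chaining follows. The names Person, German, Boche, Cruel, introRule and elimRule are hypothetical labels introduced for illustration, and the two rules appear only as hypotheses, so the theorem shows what a speaker who accepts both rules is committed to and not that the conclusion is true. The sketch also leaves out the explanatory ‘because of being German’ component.

```lean
-- Hypothetical formalization: the two rules are assumptions, not axioms.
theorem german_to_cruel {Person : Type} (German Boche Cruel : Person → Prop)
    (introRule : ∀ x, German x → Boche x)   -- Boche introduction
    (elimRule : ∀ x, Boche x → Cruel x) :   -- Boche elimination
    ∀ x, German x → Cruel x :=
  fun x hGerman => elimRule x (introRule x hGerman)
```

A speaker who rejects the conclusion must give up at least one of the two hypotheses.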

Brandom (2000) endorses the inferentialist account of slurs, and notes a sense in which slurs are unsayable for non-prejudiced speakers. On his view, once one uses a term like ‘boche’, one commits oneself to the thought that Germans are cruel because of being German. The only recourse for non-xenophobic speakers, Brandom concludes, is to refuse to employ the concept, since it embodies an inference one does not endorse. The inferentialist theory is well suited to explain the descriptive features of slurs as well as expressive autonomy. The theory also accounts for why a slur is derogatory toward an entire group of individuals, even when a speaker intends only to derogate a single person in a particular context with the term.

However, there are numerous objections to the inferentialist’s treatment of slurs. First, Hornsby (2001) questions whether it is possible to spell out for every slur the consequences to which its users are committed. Further, as Williamson (2009) observes, a speaker might grow up in a community where only the pejorative word for a group is used. For instance, someone may only know Germans as people who are ‘boche’ without knowing the term ‘Germans’. In that case, the speaker could be competent with ‘boche’ (she could know that it is a xenophobic term of abuse) without knowing the word ‘German’. Thus, knowing the ‘boche-introduction’ rule is not necessary for competency with the slur.

i. Combinatorial Externalism

Hom (2008) offers a theory of the semantic content of slurs. According to Hom, the derogatory content of a pejorative term is wholly constituted by its literal meaning. Hom makes use of the semantic externalist framework first developed by Putnam (1975) and Kripke (1980). Semantic externalism holds that the internal state of the particular speaker of a word does not fully determine the word’s meaning, which is instead determined, at least partly, by external social practices of the linguistic community to which the word actively belongs. For more on semantic externalism, see Internalism and Externalism in the Philosophy of Mind and Language. According to Putnam (1975), one can competently use terms like ‘elm’ and ‘beech’ without understanding the complex biological properties of each kind of tree, as long as one stands in the right sort of causal relation to the social institutions that determine their meaning. Similarly, according to Hom, the meaning of a slur is determined by a social institution of racism that is constituted by a racist ideology and a set of harmful discriminatory practices. Hom offers the following formal schema for the semantic content of slurs:

Ought to be subject to p*₁ + … + p*_n because of being d*₁ + … + d*_n, all because of being NPC*,

where p*₁ + … + p*_n are prescriptions for harmful discriminatory treatment derived from a set of racist practices, d*₁ + … + d*_n are negative properties derived from a racist ideology, and NPC* is the semantic value of the slur’s neutral counterpart (Hom 2008, p.431). Hom calls his view Combinatorial Externalism (CE). On this view, ‘chink’ expresses the following complex, socially constructed property as part of its literal meaning: ought to be subject to higher college admissions standards, excluded from managerial positions…, because of being slanty-eyed, devious…, all because of being Chinese.

According to Hom, one motivation for CE is that it accounts for the common intuition that slurs have empty extensions. A non-racist might say ‘There are no chinks; there are only Chinese.’ Given that no one ought to be subject to discriminatory practices because of their race, CE predicts that all racial slurs have null extensions. Hom’s semantic analysis also accounts for expressive autonomy, since the social institutions that determine the meanings of slurs are independent of the attitudes of particular speakers. Finally, CE accounts for non-derogatory, appropriated uses of slurs by in-group members. For Hom, when a targeted group appropriates a slur, they create a new supporting social institution for the term which imbues the term with a new (non-pejorative) semantic content.

Hom (2012) extends CE to cover swears. Consider Hom’s analysis of ‘John fucked Mary’:

to say that John fucked Mary is to say (something like) that they each ought to be scorned, ought to go to hell, ought to be treated as less desirable (if female), ought to be treated as more desirable (if male), ought to be treated as damaged (if female), …, for being sinful, unchaste, lustful, impure, … because of having sexual intercourse with each other. (Hom 2012, p 395)

In speech communities wherein ideologies support progressive ideas about sex and reject the meaning of the term, the term will come to have a different semantic content because the above prescriptions will no longer be a part of the semantic content of ‘fucked’.

CE faces several objections. First, the behavior of embedded slurs poses a problem for CE (see Richard (2008), Jeshion (2013) and Whiting (2013)). According to Hom (2012), derogation requires the actual predication of a slur to a targeted individual. Notice that a speaker who utters (31) has not literally assigned negative properties to anyone or prescribed negative practices for anyone, yet the utterance appears to be highly offensive and derogatory:

(31) If there are any spics among the job applicants, do not hire them.

If Hom is correct, non-prejudiced speakers should be able to endorse utterances like (31), since they would be true, given their false antecedents (Richard 2008, p17). In response, Hom (2012) suggests that wide-scoping intuitions about pejoratives can be explained by what they conversationally imply. (31) indicates that the speaker thinks that some Hispanic individuals are inferior and ought to be excluded from employment opportunities. However, if this is correct, there should be contexts where the speaker can felicitously follow up her utterance with ‘not that I mean to imply that Hispanic people are inferior or that they should be discriminated against’, since conversational implicata are cancellable. But the use of the slur in (31) seems non-defeasibly racist and derogatory. As Jeshion (2013) observes, following the utterance up with ‘but I don’t mean to imply anything derogatory’ does not get the speaker off the hook.
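The first step of this objection, that CE makes sentences like (31) true, can be checked under a simplifying assumption: that the conditional is read as a universally quantified material conditional. In the Lean 4 sketch below, S stands for the slur predicate and Rejected for the consequent. Both names, like emptyExtension, are hypothetical and introduced only for illustration.

```lean
-- If nothing falls under S (the null extension CE predicts),
-- then every conditional with S in its antecedent holds vacuously.
theorem vacuously_true {Person : Type} (S Rejected : Person → Prop)
    (emptyExtension : ∀ x, ¬ S x) :
    ∀ x, S x → Rejected x :=
  fun x hS => absurd hS (emptyExtension x)
```

The proof never inspects Rejected, so on this reading any consequent at all would come out true.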

Finally, Jeshion (2013) objects that CE’s account of the semantic content of slurs has it backwards. She argues that CE requires ideologies and social practices to antedate slurs, and this is a problem because the use of a slur for a particular group often plays a role in the creation and development of such institutions and practices. If so, a social institution could not be the source of a slur’s pejorative content.

3. A Deflationary Theory

Anderson and Lepore (2013a, 2013b) deny that the characteristic features of slurs are due to the contents they express. Their proposal is simply that “slurs are prohibited words; as such, their uses are offensive to whomever these prohibitions matter” (2013a, p.21). Anderson and Lepore note that quotation does not always eliminate the offensiveness of pejoratives (see also Saka 1998, p.122). An utterance of (32), for example, would be offensive despite the quotational use of the slur it contains:

(32) ‘Nigger’ is a term for blacks.

Anderson and Lepore argue that content theorists will have difficulty accounting for the widespread practice of avoiding the word ‘nigger’ completely (using the locution ‘the N-word’ in place of quoting the term).

Deflationism accounts for the behavior of embedded slurs. However, it faces several objections. First, the theory offers little by way of an explanation of the practical features of slurs (Croom 2011). Pointing out that slurs are prohibited words does not help us understand how they are such effective vehicles for spreading prejudice. Additionally, Whiting (2013) observes that it is possible for there to be slurs in the absence of taboos or social prohibitions. In a society in which the vast majority of speakers are prejudiced toward a particular racial group, and the targeted group members have internalized racist attitudes, it may be that no one objects to the use of slurs or finds them offensive, yet slurs might still be derogatory. Thus, social prohibitions cannot be all there is to the derogatoriness of slurs.

Finally, by defining slurs as merely prohibited words, Anderson and Lepore rule out a priori the possibility of slurs that are appropriate and morally permissible. One example might be ‘fascist’, which targets individuals based on political affiliation; using this slur to denigrate an authoritarian dictator need not (and perhaps should not) be prohibited.

4. Broader Applications

Since the 1980s, philosophical work on pejoratives has focused primarily on two questions: what (if anything) do pejoratives mean, and how is derogation by means of pejoratives accomplished? Researchers working on these questions would do well to familiarize themselves with empirical literature on pejoratives (for empirical studies on the behavioral and psychological effects of overhearing slurs, see Kirkland and others 1987, Carnaghi and Maass 2007, and Gadon and Johnson 2009).

Work on slurs in the philosophy of language and linguistics has implications for debates in other disciplines. For instance, in answering the question of whether there should be legal restrictions on hate speech (which may involve the use of slurs), we will need to get clear on how hate speech harms its targets (Hornsby 2001). Legal theorists interested in these issues will want to pay careful attention to the literature discussed in this article. (For a discussion of whether laws against hate speech are justified, see Waldron 2012.)

5. References and Further Reading

Anderson, L. and E. Lepore 2013a, “Slurring Words,” Noûs 47.1, 25-48.
- [Offers a deflationary theory of slurs]
Anderson, L. and E. Lepore 2013b, “What Did You Call Me? Slurs as Prohibited Words: Setting Things Up,” Analytic Philosophy 54.3, 350-363.
- [Responds to objections to the deflationary theory defended in their 2013a]
Ayer, A. J. 1936, Language, Truth and Logic, Dover, New York.
- [Defends an expressivist theory of moral and aesthetic terms]
Bach, K. 1994, “Conversational Impliciture,” Mind and Language 9.2, 124-162.
- [Argues that Grice’s distinction between what a speaker literally says and what she implies is not exhaustive, and posits a third, intermediate category]
Bach, K. 1999, “The Myth of Conventional Implicature,” Linguistics and Philosophy 22.4, 327-366.
- [Argues that what is commonly held to be conventionally implicated content is actually part of what is said]
Brandom, R. 2000, Articulating Reasons: An Introduction to Inferentialism, Harvard University Press, Cambridge, MA.
- [Defends an inferentialist theory of slurs]
Brontsema, R. 2004, “A Queer Revolution: Reconceptualizing the Debate over Linguistic Reclamation,” Colorado Research in Linguistics 17.1, 1-17.
- [Gives an overview of the notion of linguistic appropriation as it applies to slurs]
Camp, E. 2013, “Slurring Perspectives,” Analytic Philosophy 54.3, 330-349.
- [Defends a perspectival theory of slurs]
Carnaghi, A. and A. Maass 2007, “In-Group and Out-Group Perspectives in the Use of Derogatory Group Labels: Gay versus Fag,” Journal of Language and Social Psychology 26.2, 142-156.
- [A study that measures the effects of slurs on targeted individuals compared with non-targets]
Croom, A.M. 2011, “Slurs,” Language Sciences 33, 343-358.
- [Offers a stereotype theory of slurs]
Dummett, M. 1981, Frege: Philosophy of Language, 2nd ed., Harvard University Press, Cambridge, MA.
- [Defends an inferentialist theory of slurs]
Frege, G. 1892, “On Sinn and Bedeutung,” in M. Beaney (ed.) 1997, The Frege Reader, Blackwell, Malden, MA, 151-171.
- [A classic paper in which Frege defends his theory of sense and reference]
Frege, G. 1897, “Logic,” in M. Beaney (ed.) 1997, The Frege Reader, Blackwell, Malden, MA, 227-250.
- [Frege explicates his notion of “coloring”]
Gadon, O. and C. Johnson 2009, “The Effect of a Derogatory Professional Label: Evaluations of a “Shrink”,” Journal of Applied Social Psychology 39.3, 634-55.
- [Empirical study on the effects of overhearing a psychologist referred to as a ‘shrink’]
Geach, P. 1965, “Assertion,” Philosophical Review 74, 449-465.
- [Poses the famous Frege-Geach problem]
Gibbard, A. 2003, “Reasons Thick and Thin: A Possibility Proof,” Journal of Philosophy 100.6, 288-304.
- [Argues that slurs are like thick evaluative terms in that they express both descriptive and evaluative content]
Grice, P. 1989, Studies in the Way of Words, Harvard University Press, Cambridge, MA.
- [A collection of papers on various topics in the philosophy of language]
Hom, C. 2008, “The Semantics of Racial Epithets,” Journal of Philosophy 105, 416-440.
- [Defends a truth-conditional, semantic theory of slurs]
Hom, C. 2010, “Pejoratives,” Philosophy Compass 5.2, 164-185.
- [Gives a general overview of various theories of pejoratives]
Hom, C. 2012, “A Puzzle about Pejoratives,” Philosophical Studies 159.3, 383-405.
- [Extends the semantic theory of slurs developed in his (2008) to swear words]
Hornsby, J. 2001, “Meaning and Uselessness: How to Think About Derogatory Words,” Midwest Studies in Philosophy 25, 128-141.
- [Defends a gestural theory of slurs]
Jeshion, R. 2013, “Slurs and Stereotypes,” Analytic Philosophy 54.3, 314-329.
- [Raises objections to the theories of slurs developed by Hom (2008) and Camp (2013)]
Kaplan, D. 2004, “The Meaning of Ouch and Oops” (unpublished transcription of the Howison Lecture in Philosophy at U.C. Berkeley).
- [Defends a broadly expressivist theory of pejoratives]
Kirkland, S., J. Greenberg, and T. Pyszczynski 1987, “Further Evidence of the Deleterious Effects of Overheard Ethnic Labels: Derogation Beyond the Target,” Personality and Social Psychology Bulletin 13.2, 216-227.
- [Empirical study on how overhearing the slur ‘nigger’ affects evaluations of those targeted by the slur]
Kripke, S. 1980, Naming and Necessity, Harvard University Press, Cambridge, MA.
- [Gives a defense of semantic externalism]
Lewis, D. 1979, “Scorekeeping in a Language Game,” Journal of Philosophical Logic 8, 339-359.
- [Offers a theory of conversational kinematics]
Neale, S. 1999, “Colouring and Composition,” in R. Stainton and K. Murasugi (eds.) Philosophy and Linguistics, Westview Press, Boulder, CO, 35-82.
- [Explicates Frege’s notion of coloring]
Potts, C. 2007, “The Expressive Dimension,” Theoretical Linguistics 33.2, 255-268.
- [Offers a non-propositional version of the conventional implicature theory of slurs]
Putnam, H. 1975, “The Meaning of Meaning,” in Mind, Language and Reality: Philosophical Papers Volume 2, Cambridge University Press, Cambridge, 215-271.
- [Offers a defense of semantic externalism]
Richard, M. 2008, When Truth Gives Out, Harvard University Press, Cambridge, MA.
- [Argues that utterances containing derogatory uses of slurs are not truth-apt]
Saka, P. 1998, “Quotation and the Use-Mention Distinction,” Mind 107, 113-136.
- [Notes that quotation does not entirely eliminate the offensiveness of swear words]
Saka, P. 2007, How to Think About Meaning, Springer, Berlin.
- [Defends a hybrid expressivist theory of slurs]
Schlenker, P. 2007, “Expressive Presuppositions,” Theoretical Linguistics 33.2, 237-245.
- [Defends a presupposition theory of pejoratives]
Soames, S. 1989, “Presupposition,” in D. Gabbay and F. Guenthner (eds.) Handbook of Philosophical Logic, Kluwer, Dordrecht, 553-616.
- [Explicates the notion of linguistic presupposition]
Stenner, A.J. 1981, “A Note on Logical Truth and Non-Sexist Semantics,” in M. Vetterling-Braggin (ed.) Sexist Language: A Modern Philosophical Analysis, Littlefield, Adams and Co, New York, 299-306.
- [Defends a conventional implicature theory of slurs]
Tirrell, L. 1999, “Derogatory Terms,” in C. Hendricks and K. Oliver (eds.) Language and Liberation: Feminism, Philosophy and Language, SUNY Press, Albany, NY, 41-79.
- [Defends an inferentialist theory of slurs]
Waldron, J. 2012, The Harm in Hate Speech, Harvard University Press, Cambridge, MA.
- [Makes the case for legal restrictions on hate speech]
Whiting, D. 2007, “Inferentialism, Representationalism and Derogatory Words,” International Journal of Philosophical Studies 15.2, 191-205.
- [Offers a conventional implicature theory of slurs]
Whiting, D. 2013, “It’s Not What You Said, It’s the Way You Said It: Slurs and Conventional Implicature,” Analytic Philosophy 54.3, 364-377.
- [Responds to objections to the conventional implicature theory by Hom (2008) and others]
Williams, B. 1985, Ethics and the Limits of Philosophy, Harvard University Press, Cambridge, MA.
- [Explicates the notion of a thick evaluative term]
Williamson, T. 2009, “Reference, Inference and the Semantics of Pejoratives,” in J. Almog and P. Leonardi (eds.) The Philosophy of David Kaplan, Oxford University Press, Oxford, 137-158.
- [Raises objections to the inferentialist theory of slurs; defends a conventional implicature theory.]

Author Information

Ralph DiFranco
Email: ralph.difranco@uconn.edu
University of Connecticut
U. S. A.

Continental Rationalism

Continental rationalism is a retrospective category used to group together certain philosophers working in continental Europe in the 17^th and 18^th centuries, in particular, Descartes, Spinoza, and Leibniz, especially as they can be regarded in contrast with representatives of “British empiricism,” most notably, Locke, Berkeley, and Hume. Whereas the British empiricists held that all knowledge has its origin in, and is limited by, experience, the Continental rationalists thought that knowledge has its foundation in the scrutiny and orderly deployment of ideas and principles proper to the mind itself. The rationalists did not spurn experience as is sometimes mistakenly alleged; they were thoroughly immersed in the rapid developments of the new science, and in some cases led those developments. They held, however, that experience alone, while useful in practical matters, provides an inadequate foundation for genuine knowledge.

The fact that “Continental rationalism” and “British empiricism” are retrospectively applied terms does not mean that the distinction that they signify is anachronistic. Leibniz’s New Essays on Human Understanding, for instance, outlines stark contrasts between his own way of thinking and that of Locke, which track many features of the rationalist/empiricist distinction as it tends to be applied in retrospect. There was no rationalist creed or manifesto to which Descartes, Spinoza, and Leibniz all subscribed (nor, for that matter, was there an empiricist one). Nevertheless, with due caution, it is possible to use the “Continental rationalism” category (and its empiricist counterpart) to highlight significant points of convergence in the philosophies of Descartes, Spinoza, and Leibniz, inter alia. These include: (1) a doctrine of innate ideas; (2) the application of mathematical method to philosophy; and (3) the use of a priori principles in the construction of substance-based metaphysical systems.

1. Origin and History of the Term “Rationalism”
2. Innate Ideas
a. Descartes
b. Spinoza
c. Leibniz
d. Malebranche
3. Mathematical Method
4. A Priori Principles
a. Intelligibility and the Cartesian Circle
b. Substance Metaphysics
5. Continental Rationalism, Experience, and Experiment
6. References and Further Reading
a. Primary Sources
b. Secondary Sources

1. Origin and History of the Term “Rationalism”

According to the Historisches Wörterbuch der Philosophie, the word “rationaliste” appears in 16^th century France, as early as 1539, in opposition to “empirique.” In his New Organon, first published in 1620 (in Latin), Francis Bacon juxtaposes rationalism and empiricism in memorable terms:

Those who have treated of the sciences have been either empiricists [Empirici] or dogmatists [Dogmatici]. Empiricists [Empirici], like ants, simply accumulate and use; Rationalists [Rationales], like spiders, spin webs from themselves; the way of the bee is in between: it takes material from the flowers of the garden and the field; but it has the ability to convert and digest them. (The New Organon, p. 79; Spedding, 1, 201)

Bacon’s association of rationalists with dogmatists in this passage foreshadows Kant’s use of the term “dogmatisch” in reference, especially, to the Wolffian brand of rationalist philosophy prevalent in 18^th century Germany. Nevertheless, Bacon’s use of “rationales” does not refer to “Continental rationalism,” which developed only after the New Organon, but rather to the Scholastic philosophy that dominated the medieval period. Moreover, while Bacon is, in retrospect, often considered the father of modern empiricism, the above-quoted passage shows him no friendlier to the empirici than to the rationales. Thus, Bacon’s juxtaposition of rationalism and empiricism should not be confused with the distinction as it develops over the course of the 17^th and 18^th centuries, although his imagery is certainly suggestive.

The distinction appears in an influential form as the backdrop to Kant’s critical philosophy (which is often loosely understood as a kind of synthesis of certain aspects of Continental rationalism and British empiricism) at the end of the 18^th century. However, it was not until the time of Hegel in the first half of the 19^th century that the terms “rationalism” and “empiricism” were used to separate the figures of the 17^th and 18^th centuries into contrasting epistemological camps in the fashion with which we are familiar today. In his Lectures on the History of Philosophy, Hegel describes an opposition between “a priori thought,” on the one hand, according to which “the determinations which should be valid for thought should be taken from thought itself,” and, on the other hand, “the determination that we must begin and end and think, etc., from experience.” He describes this as the opposition between “Rationalismus” and “Empirismus” (Werke 20, 121).

2. Innate Ideas

Perhaps the best recognized and most commonly made distinction between rationalists and empiricists concerns the question of the source of ideas. Whereas rationalists tend to think (with some exceptions discussed below) that some ideas, at least, such as the idea of God, are innate, empiricists hold that all ideas come from experience. Although the rationalists tend to be remembered for their positive doctrine concerning innate ideas, their assertions are matched by a rejection of the notion that all ideas can be accounted for on the basis of experience alone. In some Continental rationalists, especially in Spinoza, the negative doctrine is more apparent than the positive. The distinction is worth bearing in mind, in order to avoid the false impression that the rationalists held to innate ideas because the empiricist alternative had not come along yet. (In general, the British empiricists came after the rationalists.) The Aristotelian doctrine, nihil in intellectu nisi prius in sensu (nothing in the intellect unless first in the senses), had been dominant for centuries, and it was in reaction against this that the rationalists revived in modified form the contrasting Platonic doctrine of innate ideas.

a. Descartes

Descartes distinguishes between three kinds of ideas: adventitious (adventitiae), factitious (factae), and innate (innatae). As an example of an adventitious idea, Descartes gives the common idea of the sun (yellow, bright, round) as it is perceived through the senses. As an example of a factitious idea, Descartes cites the idea of the sun constructed via astronomical reasoning (vast, gaseous body). According to Descartes, all ideas which represent “true, immutable, and eternal essences” are innate. Innate ideas, for Descartes, include the idea of God, the mind, and mathematical truths, such as the fact that it pertains to the nature of a triangle that its three angles equal two right angles.

By conceiving some ideas as innate, Descartes does not mean that children are born with fully actualized conceptions of, for example, triangles and their properties. This is a common misconception of the rationalist doctrine of innate ideas. Descartes strives to correct it in Comments on a Certain Broadsheet, where he compares the innateness of ideas in the mind to the tendency which some babies are born with to contract certain diseases: “it is not so much that the babies of such families suffer from these diseases in their mother’s womb, but simply that they are born with a certain ‘faculty’ or tendency to contract them” (CSM I, 304). In other words, innate ideas exist in the mind potentially, as tendencies; they are then actualized by means of active thought under certain circumstances, such as seeing a triangular figure.

At various points, Descartes defends his doctrine of innate ideas against philosophers (Hobbes, Gassendi, and Regius, inter alia) who hold that all ideas enter the mind through the senses, and that there are no ideas apart from images. Descartes is relatively consistent on his reasons for thinking that some ideas, at least, must be innate. His principal line of argument proceeds by showing that there are certain ideas, for example, the idea of a triangle, that cannot be either adventitious or factitious; since ideas are either adventitious, factitious, or innate, by process of elimination, such ideas must be innate.

Take Descartes’ favorite example of the idea of a triangle. The argument that the idea of a triangle cannot be adventitious proceeds roughly as follows. A triangle is composed of straight lines. However, straight lines never enter our mind via the senses, since when we examine straight lines under a magnifying lens, they turn out to be wavy or irregular in some way. Since we cannot derive the idea of straight lines from the senses, we cannot derive the idea of a true triangle, which is made up of straight lines, through the senses. Sometimes Descartes makes the point in slightly different terms by insisting that there is “no similarity” between the corporeal motions of the sense organs and the ideas formed in the mind on the occasion of those motions (CSM I, 304; CSMK III, 187). One such dissimilarity, which is particularly striking, is the contrast between the particularity of all corporeal motions and the universality that pure ideas can attain when conjoined to form necessary truths. Descartes makes this point in clear terms to Regius:

I would like our author to tell me what the corporeal motion is that is capable of forming some common notion to the effect that ‘things which are equal to a third thing are equal to each other,’ or any other he cares to take. For all such motions are particular, whereas the common notions are universal and bear no affinity with, or relation to, the motions. (CSM I, 304-5)

Next, Descartes has to show that the idea of a triangle is not factitious. This is where the doctrine of “true and immutable natures” comes in. For Descartes, if, for example, the idea that the three angles of a triangle are equal to two right angles were his own invention, it would be mutable, like the idea of a gold mountain, which can be changed at whim into the idea of a silver mountain. Instead, when Descartes thinks about his idea of a triangle, he is able to discover eternal properties of it that are not mutable in this way; hence, they are not invented (CSMK III, 184).

Since, therefore, the triangle can be neither adventitious nor factitious, it must be innate; that is to say, the mind has an innate tendency or power to form this idea from its own purely intellectual resources when prompted to do so.
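
The eliminative structure of this argument can be set out schematically. What follows is only a reconstruction for illustration: the predicate letters A, F, and I (for adventitious, factitious, and innate) and the constant t (for the idea of a triangle) are labels introduced here and are not Descartes’ own notation.

```latex
% Schematic reconstruction only; requires the amsmath package.
% A, F, I and t are illustrative labels, not Descartes' notation.
\begin{align*}
&\text{(P1)}\quad \forall x\,\bigl(A(x) \lor F(x) \lor I(x)\bigr) && \text{every idea is adventitious, factitious, or innate}\\
&\text{(P2)}\quad \neg A(t) && \text{the idea of a triangle is not derived from the senses}\\
&\text{(P3)}\quad \neg F(t) && \text{the idea of a triangle is not invented}\\
&\text{(C)}\quad\;\, I(t) && \text{from P1--P3 by disjunctive syllogism}
\end{align*}
```

The argument is valid as reconstructed, so its force rests entirely on the three premises.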

Descartes’ insistence that there is no similarity between the corporeal motions of our sense organs and the ideas formed in the mind on the occasion of those motions raises a difficulty for understanding how any ideas could be adventitious. Since none of our ideas have any similarity to the corporeal motions of the sense organs – even the idea of motion itself – it seems that no ideas can in fact have their origin in a source external to the mind. The reason that we have an idea of heat in the presence of fire, for instance, is not, then, because the idea is somehow transmitted by the fire. Rather, Descartes thinks that God designed us in such a way that we form the idea of heat on the occasion of certain corporeal motions in our sense organs (and we form other sensory ideas on the occasion of other corporeal motions). Thus, there is a sense in which, for Descartes, all ideas are innate, and his tripartite division between kinds of ideas becomes difficult to maintain.

b. Spinoza

Per his so-called doctrine of “parallelism,” Spinoza conceives the mind and the body as one and the same thing, conceived under different attributes (to wit, thought and extension). (See Benedict de Spinoza: Metaphysics.) As a result, Spinoza denies that there is any causal interaction between mind and body, and so Spinoza denies that any ideas are caused by bodily change. Just as bodies can be affected only by other bodies, so ideas can be affected only by other ideas. This does not mean, however, that all ideas are innate for Spinoza, as they very clearly are for Leibniz (see below). Just as the body can be conceived to be affected by external objects conceived under the attribute of extension (that is, as bodies), so the mind can (as it were, in parallel) be conceived to be affected by the same objects conceived under the attribute of thought (that is, as ideas). Ideas gained in this way, from encounters with external objects (conceived as ideas), constitute knowledge of the first kind, or “imagination,” for Spinoza, and all such ideas are “inadequate,” or in other words, confused and lacking order for the intellect. “Adequate ideas,” on the other hand, which can be formed via Spinoza’s second and third kinds of knowledge (reason and intuitive knowledge, respectively), and which are clear and distinct and have order for the intellect, are not gained through chance encounters with external objects; rather, adequate ideas can be explained in terms of resources intrinsic to the mind. (For more on Spinoza’s three kinds of knowledge and the distinction between adequate and inadequate ideas, see Benedict de Spinoza: Epistemology.)

The mind, for Spinoza, just by virtue of having ideas, which is its essence, has ideas of what Spinoza calls “common notions,” or in other words, those things which are “equally in the part and in the whole.” Examples of common notions include motion and rest, extension, and indeed God. Take extension for example. To think of any body – however small or however large – is to have a perfectly complete idea of extension. So, insofar as the mind has any idea of body (and, for Spinoza, the human mind is the idea of the human body, and so always has ideas of body), it has a perfectly adequate idea of extension. The same can be said for motion and rest. The same can also be said for God, except that God is not equally in the part and in the whole of extension only, but of all things. Spinoza treats these common notions as principles of reasoning. Anything that can be deduced on their basis is also adequate.

It is not clear if Spinoza’s common notions should be considered innate ideas. Spinoza speaks of active and passive ideas, and adequate and inadequate ideas. He associates the former with the intellect and the latter with the imagination, but “innate idea” is not an explicit category in Spinoza’s theory of ideas as it is in Descartes’ and also Leibniz’s. Common notions are not “in” the mind independent of the mind’s relation with its object (the body); nevertheless, since it is the mind’s nature to be the idea of the body, it is part of the mind’s nature to have common notions. Commentators differ over the question of whether Spinoza had a positive doctrine of innate ideas; it is clear, however, that he denied that all ideas come about through encounters with external objects; moreover, he believed that those ideas which do come about through encounters with external objects are of inferior epistemic value to those produced through the mind’s own intrinsic resources; this is enough to put him in the rationalist camp on the question of the origin of ideas.

c. Leibniz

Of the three great rationalists, Leibniz propounded the most thoroughgoing doctrine of innate ideas. For Leibniz, all ideas are strictly speaking innate. In a general and relatively straightforward sense, this viewpoint is a direct consequence of Leibniz’s conception of individual substance. According to Leibniz, “each substance is a world apart, independent of everything outside of itself except for God. Thus all our phenomena, that is to say, all the things that can ever happen to us, are only the results of our own being” (L, 312); or, in Leibniz’s famous phrase from the Monadology, “monads have no windows,” meaning there is no way for sensory data to enter monads from the outside. In this more general sense, then, to give an explanation for Leibniz’s doctrine of innate ideas would be to explain his conception of individual substance and the arguments and considerations which motivate it. (See Section 4, b, iii, below for a discussion of Leibniz’s conception of substance; see also Gottfried Leibniz: Metaphysics.) This would be to circumvent the issues and questions which are typically at the heart of the debate over the existence of innate ideas, which concern the extent to which certain kinds of perceptions, ideas, and propositions can be accounted for on the basis of experience. Although Leibniz’s more general reasons for embracing innate ideas stem from his unique brand of substance metaphysics, Leibniz does enter into the debate over innate ideas, as it were, addressing the more specific questions regarding the source of given kinds of ideas, most notably in his dialogic engagement with Locke’s philosophy, New Essays on Human Understanding.

Due to Leibniz’s conception of individual substance, nothing actually comes from a sensory experience, where a sensory experience is understood to involve direct concourse with things outside of the mind. However, Leibniz does have a means for distinguishing between sensations and purely intellectual thoughts within the framework of his substance metaphysics. For Leibniz, although each monad or individual substance “expresses” (or represents) the entire universe from its own unique point of view, it does so with a greater or lesser degree of clarity and distinctness. Bare monads, such as comprise minerals and vegetation, express the rest of the world only in the most confused fashion. Rational minds, by contrast, have a much greater proportion of clear and distinct perceptions, and so express more of the world clearly and distinctly than do bare monads. When an individual substance attains a more perfect expression of the world (in the sense that it attains a less confused expression of the world), it is said to act; when its expression becomes more confused, it is said to be acted upon. Using this distinction, Leibniz is able to reconcile the terms of his philosophy with everyday conceptions. Although, strictly speaking, no monad is acted upon by any other, nor acts upon any other directly, it is possible to speak this way, just as, Leibniz says, Copernicans can still speak of the motion of the sun for everyday purposes, while understanding that the sun does not in fact move. It is in this sense that Leibniz enters into the debate concerning the origin of our ideas.

Leibniz distinguishes between “ideas” (idées) and “thoughts” (pensées) (or, sometimes, “notions” (notions) or “concepts” (conceptus)). Ideas exist in the soul whether we actually perceive them or are aware of them or not. It is these “ideas” that Leibniz contends are innate. “Thoughts,” by contrast, is Leibniz’s designation for ideas which we actually form or conceive at any given time. In this sense, “thoughts” can be formed on the basis of a sensory experience (with the above caveats regarding the meaning a sensory experience can have in Leibniz’s thought) or on the basis of an internal experience, or a reflection. Leibniz alternatively characterizes our “ideas” as “aptitudes,” “preformations,” and as “dispositions” to represent something when the occasion for thinking of it arises. On multiple occasions, Leibniz uses the metaphor of the veins present in marble to illustrate his understanding of innate ideas. Just as the veins dispose the sculptor to shape the marble in certain ways, so do our ideas dispose us to have certain thoughts on the occasion of certain experiences.

Leibniz rejects the view that the mind cannot have ideas without being aware that it has them. (See Gottfried Leibniz: Philosophy of Mind.) Much of the disagreement between Locke and Leibniz on the question of innate ideas turns on this point, since Locke (at least as Leibniz represents him in the New Essays) is not able to make any sense out of the notion that the mind can have ideas without being aware of them. Much of Leibniz’s defense of his innate ideas doctrine takes the form of replying to Locke’s charge that it is absurd to hold that the mind could think (that is, have ideas) without being aware of it.

Leibniz marshals several considerations in support of his view that the mind is not always aware of its ideas. The fact that we can store many more ideas in our understanding than we can be aware of at any given time is one. Leibniz also points to the phenomenology of attention; we do not attend to everything in our perceptual field at any given time; rather we focus on certain things at the expense of others. To convey a sense of what it might be like for the mind to have perceptions and ideas in a dreamless sleep, Leibniz asks the reader to imagine subtracting our attention from perceptual experience; since we can distinguish between what is attended to and what is not, subtracting attention does not eliminate perception altogether.

While such considerations suggest the possibility of innate ideas, they do not in and of themselves prove that innate ideas are necessary to explain the full scope of human cognition. The empiricist tends to think that if innate ideas are not necessary to explain cognition, then they should be abandoned as gratuitous metaphysical constructs. Leibniz does have arguments designed to show that innate ideas are needed for a full account of human cognition.

In the first place, Leibniz recalls favorably the famous scenario from Plato’s Meno where Socrates teaches a slave boy to grasp abstract mathematical truths merely by asking questions. The anecdote is supposed to indicate that mathematical truths can be generated by the mind alone, in the absence of particular sensory experiences, if only the mind is prompted to discover what it contains within itself. Concerning mathematics and geometry, Leibniz remarks: “one could construct these sciences in one’s study and even with one’s eyes closed, without learning from sight or even from touch any of the needed truths” (NE, 77). So, on these grounds, Leibniz contends that without innate ideas, we could not explain the sorts of cognitive capacities exhibited in the mathematical sciences.

A second argument concerns our capacity to grasp certain necessary or eternal truths. Leibniz says that necessary truths can be suggested, justified, and confirmed by experience, but that they can be proved only by the understanding alone (NE, 80). Leibniz does not explain this point further, but he seems to have in mind the point later made by both Hume and Kant (to different ends), that experience on its own can never account for the kind of certainty that we find in mathematical and metaphysical truths. For Leibniz, if it can be granted that we can be certain of propositions in mathematics and metaphysics – and Leibniz thinks this must be granted – recourse must be had to principles innate to the mind in order to explain our ability to be certain of such things.

d. Malebranche

It is worth noting briefly the position of Nicolas Malebranche on innate ideas, since Malebranche is often considered among the rationalists, yet he denied the doctrine of innate ideas. Malebranche’s reasons for rejecting innate ideas were anything but empiricist in nature, however. His leading objection stems from the infinity of ideas that the mind is able to form independently of the senses; as an example, Malebranche cites the infinite number of triangles of which the mind could in principle, albeit not in practice, form ideas. Unlike Descartes and Leibniz, who view innate ideas as tendencies or dispositions to form certain thoughts under certain circumstances, Malebranche understands them as fully formed entities that would have to exist somehow in the mind were they to exist there innately. Given this conception, Malebranche finds it unlikely that God would have created “so many things along with the mind of man” (The Search After Truth, p. 227). Since God already contains the ideas of all things within Himself, Malebranche thinks that it would be much more economical if God were simply to reveal to us the ideas of things that already exist in Him rather than placing an infinity of ideas in each human mind. Malebranche’s tenet that “we see all things in God” thus follows upon the principle that God always acts in the simplest ways. Malebranche finds further support for this doctrine in the fact that it places human minds in a position of complete dependence on God. Thus, if Malebranche’s rejection of innate ideas distinguishes him from other rationalists, it does so not from an empiricist standpoint, but rather because of the extent to which his position on ideas is theologically motivated.

3. Mathematical Method

In one sense, what it means to be a rationalist is to model philosophy on mathematics, and, in particular, geometry. This means that the rationalist begins with definitions and intuitively self-evident axioms and proceeds thence to deduce a philosophical system of knowledge that is both certain and complete. This at least is the goal and (with some qualifications to be explored below) the claim. In no work of rationalist philosophy is this procedure more apparent than in Spinoza’s Ethics, laid out famously in the geometrical manner (more geometrico). Nevertheless, Descartes’ main works (and those of Leibniz as well), although not as overtly more geometrico as Spinoza’s Ethics, are also modelled after geometry, and it is Descartes’ celebrated methodological program that first introduces mathematics as a model for philosophy.

a. Descartes

Perhaps Descartes’ clearest and most well-known statement of mathematics’ role as paradigm appears in the Discourse on the Method:

Those long chains of very simple and easy reasonings, which geometers customarily use to arrive at their most difficult demonstrations, had given me occasion to suppose that all the things which can fall under human knowledge are interconnected in the same way. (CSM I, 120)

However, Descartes’ promotion of mathematics as a model for philosophy dates back to his early, unfinished work, Rules for the Direction of the Mind. It is in this work that Descartes first outlines his standards for certainty that have since come to be so closely associated with him and with the rationalist enterprise more generally.

In Rule 2, Descartes declares that henceforth only what is certain should be valued and counted as knowledge. This means the rejection of all merely probable reasoning, which Descartes associates with the philosophy of the Schools. Descartes admits that according to this criterion, only arithmetic and geometry thus far count as knowledge. But Descartes does not conclude that only in these disciplines is it possible to attain knowledge. According to Descartes, the reason that certainty has eluded philosophers has as much to do with the disdain that philosophers have for the simplest truths as it does with the subject matter. Admittedly, the objects of arithmetic and geometry are especially pure and simple, or, as Descartes will later say, “clear and distinct.” Nevertheless, certainty can be attained in philosophy as well, provided the right method is followed.

Descartes distinguishes between two ways of achieving knowledge: “through experience and through deduction […] [W]e must note that while our experiences of things are often deceptive, the deduction or pure inference of one thing from another can never be performed wrongly by an intellect which is in the least degree rational […]” (CSM I, 12). This is a clear statement of Descartes’ methodological rationalism. Building up knowledge through accumulated experience can only ever lead to the sort of probable knowledge that Descartes finds lacking. “Pure inference,” by contrast, can never go astray, at least when it is conducted by right reason. Of course, the truth value of a deductive chain is only as good as the first truths, or axioms, whose truth the deductions preserve. It is for this reason that Descartes’ method relies on intuition as well as deduction. Intuition provides the first principles of a deductive system, for Descartes. Intuition differs from deduction insofar as it is not discursive. Intuition grasps its object in an immediate way. In its broadest outlines, Descartes’ method is just the use of intuition and deduction in the orderly attainment and preservation of certainty.

In subsequent Rules, Descartes goes on to elaborate a more specific methodological program, which involves reducing complicated matters step by step to simpler, intuitively graspable truths, and then using those simple truths as principles from which to deduce knowledge of more complicated matters. It is generally accepted by scholars that this more specific methodological program reappears in a more iconic form in the Discourse on the Method as the four rules for gaining knowledge outlined in Part 2. There is some doubt as to the extent to which this more specific methodological program actually plays any role in Descartes’ mature philosophy as it is expressed in the Meditations and Principles (see Garber 2001, chapter 2). There can be no doubt, however, that the broader methodological guidelines outlined above were a permanent feature of Descartes’ thought.

In response to a request to cast his Meditations in the geometrical style (that is, in the style of Euclid’s Elements), Descartes distinguishes between two aspects of the geometrical style: order and method, explaining:

The order consists simply in this. The items which are put forward first must be known entirely without the aid of what comes later; and the remaining items must be arranged in such a way that their demonstration depends solely on what has gone before. I did try to follow this order very carefully in my Meditations […] (CSM II, 110)

Elsewhere, Descartes contrasts this order, which he calls the “order of reasons,” with another order, which he associates with scholasticism, and which he calls the “order of subject-matter” (see CSMK III, 163). What Descartes understands as “geometrical order” or the “order of reasons” is just the procedure of starting with what is most simple, and proceeding in a step-wise, deliberate fashion to deduce consequences from there. Descartes’ order is governed by what can be clearly and distinctly intuited, and by what can be clearly and distinctly inferred from such self-evident intuitions (rather than by a concern for organizing the discussion into neat topical categories per the order of subject-matter).
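
The structural requirement of the order of reasons, that nothing be put forward before what its demonstration depends on, can be illustrated with a small sketch. This is only an illustration under stated assumptions: the function name `follows_order_of_reasons`, the item labels, and the dependency map are hypothetical and stand in for no particular Cartesian text.

```python
from typing import Dict, List, Sequence


def follows_order_of_reasons(
    sequence: Sequence[str], depends_on: Dict[str, List[str]]
) -> bool:
    """Return True if each item appears only after everything it depends on."""
    established = set()
    for item in sequence:
        # Every premise this item relies on must already have been put forward.
        for premise in depends_on.get(item, []):
            if premise not in established:
                return False
        established.add(item)
    return True


# Hypothetical dependency map, for illustration only.
DEPENDS_ON = {
    "axiom_1": [],
    "proposition_1": ["axiom_1"],
    "proposition_2": ["axiom_1", "proposition_1"],
}

valid_order = ["axiom_1", "proposition_1", "proposition_2"]
invalid_order = ["proposition_2", "axiom_1", "proposition_1"]

print(follows_order_of_reasons(valid_order, DEPENDS_ON))    # True
print(follows_order_of_reasons(invalid_order, DEPENDS_ON))  # False
```

An arrangement by topic may or may not pass this check; what the order of reasons adds is the demand that it always does.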

As for method, Descartes distinguishes between analysis and synthesis. For Descartes, analysis and synthesis represent different methods of demonstrating a conclusion or set of conclusions. Analysis exhibits the path by which the conclusion comes to be grasped. As such, it can be thought of as the order of discovery or order of knowledge. Synthesis, by contrast, wherein conclusions are deduced from a series of definitions, postulates, and axioms, as in Euclid’s Elements, for instance, follows not the order in which things are discovered, but rather the order that things bear to one another in reality. As such, it can be thought of as the order of being. God, for example, is prior to the human mind in the order of being (since God created the human mind), and so in the synthetic mode of demonstration the existence of God is demonstrated before the existence of the human mind. However, knowledge of one’s own mind precedes knowledge of God, at least in Descartes’ philosophy, and so in the analytic mode of demonstration the cogito is demonstrated before the existence of God. Descartes’ preference is for analysis, because he thinks that it is superior in helping the reader to discover the things for herself, and so in bringing about the intellectual conversion which it is the Meditations’ goal to effectuate in the minds of its readers. According to Descartes, while synthesis, in laying out demonstrations systematically, is useful in preempting dissent, it is inferior in engaging the mind of the reader.

Two primary distinctions can be made in summarizing Descartes’ methodology: (1) the distinction between the order of reasons and the order of subject-matter; and (2) the analysis/synthesis distinction. With respect to the first distinction, the great Continental rationalists are united. All adhere to the order of reasons, as we have described it above, rather than the order of subject-matter. Even though the rationalists disagree about how exactly to interpret the content of the order of reasons, their common commitment to following an order of reasons is a hallmark of their rationalism. Although there are points of convergence with respect to the second, analysis/synthesis distinction, there are also clear points of divergence, and this distinction can be useful in highlighting the range of approaches the rationalists adopt to mathematical methodology.

b. Spinoza

Of the great Continental rationalists, Spinoza is the most closely associated with mathematical method due to the striking presentation of his magnum opus, the Ethics (as well as his presentation of Descartes’ Principles), in geometrical fashion. The fact that Spinoza is the only major rationalist to present his main work more geometrico might create the impression that he is the only philosopher to employ mathematical method in constructing and elaborating his philosophical system. This impression is mistaken, since both Descartes and Leibniz also apply mathematical method to philosophy. Nevertheless, there are differences between Spinoza’s employment of mathematical method and that of Descartes (and Leibniz). The most striking, of course, is the form of Spinoza’s Ethics. Each part begins with a series of definitions, axioms, and postulates and proceeds thence to deduce propositions, the demonstrations of which refer back to the definitions, axioms, postulates and previously demonstrated propositions on which they depend. Of course, this is just the method of presenting findings that Descartes in the Second Replies dubbed “synthesis.” For Descartes, analysis and synthesis differ only in pedagogical respects: whereas analysis is better for helping the reader discover the truth for herself, synthesis is better in compelling agreement.

There is some evidence that Spinoza’s motivations for employing synthesis were in part pedagogical. In Lodewijk Meyer’s preface to Spinoza’s Principles of Cartesian Philosophy, Meyer uses Descartes’ Second Replies distinction between analysis and synthesis to explain the motivation for the work. Meyer criticizes Descartes’ followers for being too uncritical in their enthusiasm for Descartes’ thought, and attributes this in part to the relative opacity of Descartes’ analytic mode of presentation. Thus, for Meyer, the motivation for presenting Descartes’ Principles in the synthetic manner is to make the proofs more transparent, and thereby leave less excuse for blind acceptance of Descartes’ conclusions. It is not clear to what extent Meyer’s explanation of the mode of presentation of Spinoza’s Principles of Cartesian Philosophy applies to Spinoza’s Ethics. In the first place, although Spinoza approved the preface, he did not author it himself. Secondly, while such an explanation seems especially suited to a work in which Spinoza’s chief goal was to present another philosopher’s thought in a different form, there is no reason to assume that it applies to the presentation of Spinoza’s own philosophy. Scholars have differed on how to interpret the geometrical form of Spinoza’s Ethics. However, it is generally accepted that Spinoza’s use of synthesis does not merely represent a pedagogical preference. There is reason to think that Spinoza’s methodology differs from that of Descartes in a somewhat deeper way.

There is another version of the analysis/synthesis distinction besides Descartes’ that was also influential in the 17th century, that is, Hobbes’ version of the distinction. Although there is little direct evidence that Spinoza was influenced by Hobbes’ version of the distinction, some scholars have claimed a connection, and, in any case, it is useful to view Spinoza’s methodology in light of the Hobbesian alternative.

Synthesis and analysis are not modes of demonstrating findings that have already been made, for Hobbes, as they are for Descartes, but rather complementary means of generating findings; in particular, they are forms of causal reasoning. For Hobbes, analysis is reasoning from effects to causes; synthesis is reasoning in the other direction, from causes to effects. For example, by analysis, we infer that geometrical objects are constructed via the motions of points and lines and surfaces. Once motion has been established as the principle of geometry, it is then possible, via synthesis, to construct the possible effects of motion, and thereby, to make new discoveries in geometry. According to the Hobbesian schema, then, synthesis is not merely a mode of presenting truths, but a means of generating and discovering truths. (For Hobbes’ method, see The English Works of Thomas Hobbes of Malmesbury, vol. 1, ch. 6.) There is reason to think that synthesis had this kind of significance for Spinoza, as well – as a means of discovery, not merely presentation. Spinoza’s methodology, and, in particular, his theory of definitions, bear this out.

Spinoza’s method begins with reflection on the nature of a “given true idea.” The “given true idea” serves as a standard by which the mind learns the distinction between true and false ideas, and also between the intellect and the imagination, and how to direct itself properly in the discovery of true ideas. The correct formulation of definitions emerges as the most important factor in directing the mind properly in the discovery of true ideas. To illustrate his conception of a good definition, Spinoza contrasts two definitions of a circle. On one definition, a circle is a figure in which all the lines from the center to the circumference are equal. On another, a circle is the figure described by the rotation of a line around one of its ends, which is fixed. For Spinoza, the second definition is superior. Whereas the first definition gives only a property of the circle, the second provides the cause from which all of the properties can be deduced. Hence, what makes a definition a good definition, for Spinoza, is its capacity to serve as a basis for the discovery of truths about the thing. The circle, of course, is just an example. For Spinoza, the method is perfected when it arrives at a true idea of the first cause of all things, that is, God. Only the method is perfected with a true idea of God, however, not the philosophy. The philosophy itself begins with a true idea of God, since the philosophy consists in deducing the consequences from a true idea of God. With this in mind, the definition of God is of paramount importance. In correspondence, Spinoza compares contrasting definitions of God, explaining that he chose the one which expresses the efficient cause from which all of the properties of God can be deduced.

In this light, it becomes clear that the geometrical presentation of Spinoza’s philosophy is not merely a pedagogic preference. The definitions that appear at the outset of the five parts of the Ethics do not serve merely to make explicit what might otherwise have remained only implicit in Descartes’ analytic mode of presentation. Rather, key definitions, such as the definition of God, are principles that underwrite the development of the system. As a result, Hobbes’ conception of the analysis/synthesis distinction throws an important light on Spinoza’s procedure. There is a movement of analysis in arriving at the causal definition of God from the preliminary “given true idea.” Then there is a movement of synthesis in deducing consequences from that causal definition. Of course, Descartes’ analysis/synthesis distinction still applies, since, after all, Spinoza’s system is presented in the synthetic manner in the Ethics. But the geometrical style of presentation is not merely a pedagogical device in Spinoza’s case. It is also a clue to the nature of his system.

c. Leibniz

Leibniz is openly critical of Descartes’ distinction between analysis and synthesis, writing, “Those who think that the analytic presentation consists in revealing the origin of a discovery, the synthetic in keeping it concealed, are in error” (L, 233). This comment is aimed at Descartes’ formulation of the distinction in the Second Replies. Leibniz is explicit about his adherence to the viewpoint that seems to be implied by Spinoza’s methodology: synthesis is itself a means of discovering truth no less than analysis, not merely a mode of presentation. Leibniz’s understanding of analysis and synthesis is closer to the Hobbesian conception, which views analysis and synthesis as different directions of causal reasoning: from effects to causes (analysis) and from causes to effects (synthesis). Leibniz formulates the distinction in his own terms as follows:

Synthesis is achieved when we begin from principles and run through truths in good order, thus discovering certain progressions and setting up tables, or sometimes general formulas, in which the answers to emerging questions can later be discovered. Analysis goes back to the principles in order to solve the given problems only […] (L, 232)

Leibniz thus conceives synthesis and analysis in relation to principles.

Leibniz lays great stress on the importance of establishing the possibility of ideas, that is to say, establishing that ideas do not involve contradiction, and this applies a fortiori to first principles. For Leibniz, the Cartesian criterion of clear and distinct perception does not suffice for establishing the possibility of an idea. Leibniz is critical, in particular, of Descartes’ ontological argument on the grounds that Descartes neglects to demonstrate the possibility of the idea of a most perfect being on which the argument depends. It is possible to mistakenly assume that an idea is possible, when in reality it is contradictory. Leibniz gives the example of a wheel turning at the fastest possible rate. It might at first seem that this idea is legitimate, but if a spoke of the wheel were extended beyond the rim, the end of the spoke would move faster than a nail in the rim itself, revealing a contradiction in the original notion.
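
The wheel example can be put in elementary kinematic terms. What follows is only an illustrative sketch: the symbols (angular velocity \omega, rim radius r, extended spoke length r', and supposed greatest speed v_max) are assumptions introduced here for clarity and are not Leibniz’s own notation.

```latex
% Suppose the nail in the rim moves at the greatest possible speed v_max.
% For a rigid wheel turning with angular velocity \omega > 0, a point at
% distance r from the axle has linear speed \omega r, so
\[
  v_{\text{rim}} = \omega r = v_{\max}.
\]
% Extend a spoke beyond the rim to a length r' > r. Its end turns with the
% same angular velocity, hence
\[
  v_{\text{end}} = \omega r' > \omega r = v_{\max},
\]
% which contradicts the supposition that v_max is the greatest possible speed.
```

On this reading, the contradiction lies in the notion of a greatest possible speed itself, which is why the idea only appears to be possible until it is analysed.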

For Leibniz, there are two ways of establishing the possibility of an idea: by experience (a posteriori) and by reducing concepts via analysis down to a relation of identity (a priori). Leibniz credits mathematicians and geometers with pushing the practice of demonstrating what would otherwise normally be taken for granted the furthest. For example, in Meditations on Knowledge, Truth, and Ideas, Leibniz writes, “That brilliant genius Pascal agrees entirely with these principles when he says, in his famous dissertation on the geometrical spirit […] that it is the task of the geometer to define all terms though ever so little obscure and to prove all truths though little doubtful” (L, 294). Leibniz credits his own doctrine of the possibility of ideas with clarifying exactly what it means for something to be beyond doubt and obscurity.

Leibniz describes the result of the reduction of concepts to identity variously as follows: when the thing is resolved into simple primitive notions understood in themselves (L, 231); “when every ingredient that enters into a distinct concept is itself known distinctly”; “when analysis is carried through to the end” (L, 292). Since, for Leibniz, all true ideas can be reduced to simple identities, it is, in principle, possible to derive all truths via a movement of synthesis from such simple identities in the way that mathematicians produce systems of knowledge on the basis of their basic definitions and axioms. This kind of a priori knowledge of the world is restricted to God, however. According to Leibniz, it is only possible for our finite minds to have this kind of knowledge – which Leibniz calls “intuitive” or “adequate” – in the case of things which do not depend on experience, or what Leibniz also calls “truths of reason,” which include abstract logical and metaphysical truths, and mathematical propositions. In the case of “truths of fact,” by contrast, with the exception of immediately graspable facts of experience, such as, “I think,” and “Various things are thought by me,” we are restricted to formulating hypotheses to explain the phenomena of sensory experience, and such knowledge of the world can, for us, only ever achieve the status of hypothesis, though our hypothetical knowledge can be continually improved and refined. (See Section 5, c, below for a discussion of hypotheses in Leibniz.)

Leibniz is in line with his rationalist predecessors in emphasizing the importance of proper order in philosophizing. Leibniz’s emphasis on establishing the possibility of ideas prior to using them in demonstrating propositions could be understood as a refinement of the geometrical order that Descartes established over against the order of subject-matter. Leibniz emphasizes order in another connection vis-à-vis Locke. As Leibniz makes clear in his New Essays, one of the clearest points of disagreement between him and Locke is on the question of innate ideas. In preliminary comments that Leibniz drew up upon first reading Locke’s Essay, and which he sent to Locke via Burnett, Leibniz makes the following point regarding philosophical order:

Concerning the question whether there are ideas and truths born with us, I do not find it absolutely necessary for the beginnings, nor for the practice of the art of thinking, to answer it; whether they all come to us from outside, or they come from within us, we will reason correctly provided that we keep in mind what I said above, and that we proceed with order and without prejudice. The question of the origin of our ideas and of our maxims is not preliminary in philosophy, and it is necessary to have made great progress in order to resolve it. (Philosophische Schriften, vol. 5, pp. 15-16)

Leibniz’s allusion to what he “said above” refers to remarks regarding the establishment of the possibility of ideas via experience and the principle of identity. This passage makes it clear that, from Leibniz’s point of view, the order in which Locke philosophizes is quite misguided, since Locke begins with a question that should only be addressed after “great progress” has already been made, particularly with respect to the criteria for distinguishing between true and false ideas, and for establishing legitimate philosophical principles. Empiricists generally put much less emphasis on the order of philosophizing, since they do not aim to reason from first principles grasped a priori.

4. A Priori Principles

A fundamental tenet of rationalism – perhaps the fundamental tenet – is that the world is intelligible. The intelligibility tenet means that everything that happens in the world happens in an orderly, lawful, rational manner, and that the mind, in principle, if not always in practice, is able to reproduce the interconnections of things in thought provided that it adheres to certain rules of right reasoning. The intelligibility of the world is sometimes couched in terms of a denial of brute facts, where a “brute fact” is something that “just is the case,” that is, something that obtains without any reason or explanation (even in principle). Many of the a priori principles associated with rationalism can be understood either as versions or implications of the principle of intelligibility. As such, the principle of intelligibility functions as a basic principle of rationalism. It appears under various guises in the great rationalist systems and is used to generate contrasting philosophical systems. Indeed, one of the chief criticisms of rationalism is the fact that its principles can consistently be used to generate contradictory conclusions and systems of thought. The clearest and best known statement of the intelligibility of the world is Leibniz’s principle of sufficient reason. Some scholars have recently emphasized this principle as the key to understanding rationalism (see Della Rocca 2008, chapter 1).

The intelligibility principle raises some classic philosophical problems. Chief among these is a problem of question-begging or circularity. The task of proving that the world is intelligible seems to have to rely on some of the very principles of reasoning in question. In the 17th century, discussion of this fundamental problem centered around the so-called “Cartesian circle.” The problem is still debated by scholars of 17th century thought today. The viability of the rationalist enterprise seems to depend, at least in part, on a satisfactory answer to this problem.

a. Intelligibility and the Cartesian Circle

The most important rational principle in Descartes’ philosophy, the principle which does a great deal of the work in generating its details, is the principle according to which whatever is clearly and distinctly perceived to be true is true. This principle means that if we can form any clear and distinct ideas, then we will be able to trust that they accurately represent their objects, and give us certain knowledge of reality. Descartes’ clear and distinct ideas doctrine is central to his conception of the world’s intelligibility, and indeed, it is central to the rationalists’ conception of the world’s intelligibility more broadly. Although Spinoza and Leibniz both work to refine understanding of what it is to have clear and distinct ideas, they both subscribe to the view that the mind, when directed properly, is able to accurately represent certain basic features of reality, such as the nature of substance.

For Descartes, it cannot be taken for granted from the outset that what we clearly and distinctly perceive to be true is in fact true. It is possible to entertain the doubt that an all-powerful deceiving being fashioned the mind so that it is deceived even in those things it perceives clearly and distinctly. Nevertheless, it is only possible to entertain this doubt when we are not having clear and distinct perceptions. When we are perceiving things clearly and distinctly, their truth is undeniable. Moreover, we can use our capacity for clear and distinct perceptions to demonstrate that the mind was not fashioned by an all-powerful deceiving being, but rather by an all-powerful benevolent being who would not fashion us so as to be deceived even when using our minds properly. Having proved the existence of an all-powerful benevolent being qua creator of our minds, we can no longer entertain any doubts regarding our clear and distinct ideas even when we are not presently engaged in clear and distinct perceptions.

Descartes’ legitimation of clear and distinct perception via his proof of a benevolent God raises notorious interpretive challenges. Scholars disagree about how to resolve the problem of the “Cartesian circle.” However, there is general consensus that Descartes’ procedure is not, in fact, guilty of vicious, logical circularity. In order for Descartes’ procedure to avoid circularity, it is generally agreed that in some sense clear and distinct ideas need already to be legitimate before the proof of God’s existence. It is only in another sense that God’s existence legitimates their truth. Scholars disagree on how exactly to understand those different senses, but they generally agree that there is some sense at least in which clear and distinct ideas are self-legitimating, or, otherwise, not in need of legitimation.

That some ideas provide a basic standard of truth is a fundamental tenet of rationalism, and undergirds all the other rationalist principles at work in the construction of rationalist systems of philosophy. For the rationalists, if it cannot be taken for granted in at least some sense from the outset that the mind is capable of discerning the difference between truth and falsehood, then one never gets beyond skepticism.

b. Substance Metaphysics

The Continental rationalists deploy the principle of intelligibility and subordinate rational principles derived from it in generating much of the content of their respective philosophical systems. In no aspect of their systems is the application of rational principles to the generation of philosophical content more evident and more clearly illustrative of contrasting interpretations of these principles than in that for which the Continental rationalists are arguably best known: substance metaphysics.

i. Descartes

Descartes deploys his clear and distinct ideas doctrine in justifying his most well-known metaphysical position: substance dualism. The first step in Descartes’ demonstration of mind-body dualism, or, in his terminology, of a “real” distinction (that is, a distinction between two substances) between mind and body is to show that while it is possible to doubt that one has a body, it is not possible to doubt that one is thinking. As Descartes makes clear in the Principles of Philosophy, one of the chief upshots of his famous cogito argument is the discovery of the distinction between a thinking thing and a corporeal thing. The impossibility of doubting one’s existence is not the impossibility of doubting that one is a human being with a body with arms and legs and a head. It is the impossibility of doubting, rather, that one doubts, perceives, dreams, imagines, understands, wills, denies, and other modalities that Descartes attributes to the thinking thing. It is possible to think of oneself as a thing that thinks, and to recognize that it is impossible to doubt that one thinks, while continuing to doubt that one has a body with arms and legs and a head. So, the cogito drives a preliminary wedge between mind and body.

At this stage of the argument, however, Descartes has simply established that it is possible to conceive of himself as a thinking thing without conceiving of himself as a corporeal thing. It remains possible that, in fact, the thinking thing is identical with a corporeal thing, in other words, that thought is somehow something a body can do; Descartes has yet to establish that the epistemological distinction between his knowledge of his mind and his knowledge of body that results from the hyperbolic doubt translates to a metaphysical or ontological distinction between mind and body. The move from the epistemological distinction to the ontological distinction proceeds via the doctrine of clear and distinct ideas. Having established that whatever he clearly and distinctly perceives is true, Descartes is in a position to affirm the real distinction between mind and body.

In this life, it is never possible to clearly and distinctly perceive a mind actually separate from a body, at least in the case of finite, created minds, because minds and bodies are intimately unified in the composite human being. So Descartes cannot base his proof for the real distinction of mind and body on the clear and distinct perception that mind and body are in fact independently existing things. Rather, Descartes’ argument is based on the joint claims that (1) it is possible to have a clear and distinct idea of thought apart from extension and vice versa; and (2) whatever we can clearly and distinctly understand is capable of being created by God exactly as we clearly and distinctly understand it. Thus, the fact that we can clearly and distinctly understand thought apart from extension and vice versa entails that thinking things and extended things are “really” distinct (in the sense that they are distinct substances separable by God).

The foregoing argument relies on certain background assumptions which it is now necessary to explain, in particular, Descartes’ conception of substance. In the Principles, Descartes defines substance as “a thing which exists in such a way as to depend on no other thing for its existence” (CSM I, 210). Properly speaking, only God can be understood to depend on no other thing, and so only God is a substance in the absolute sense. Nevertheless, Descartes allows that, in a relative sense, created things can count as substances too. A created thing is a substance if the only thing it relies upon for its existence is “the ordinary concurrence of God” (ibid.). Only mind and body qualify as substances in this secondary sense. Everything else is a modification or property of minds and bodies. A second point is that, for Descartes, we do not have a direct knowledge of substance; rather, we come to know substance by virtue of its attributes. Thought and extension are the attributes or properties in virtue of which we come to know thinking and corporeal substance, or “mind” and “body.” This point relies on the application of a key rational principle, to wit, nothingness has no properties. For Descartes, there cannot simply be the properties of thinking and extension without these properties having something in which to inhere. Thinking and extension are not just any properties; Descartes calls them “principal attributes” because they constitute the nature of their respective substances. Other, non-essential properties, cannot be understood without the principal attribute, but the principal attribute can be understood without any of the non-essential properties. For example, motion cannot be understood without extension, but extension can be understood without motion.

Descartes’ conception of mind and body as distinct substances includes some interesting corollaries which result from a characteristic application of rational principles and account for some characteristic doctrinal differences between Descartes and empiricist philosophers. One consequence of Descartes’ conception of the mind as a substance whose principal attribute is thought is that the mind must always be thinking. Since, for Descartes, thinking is something of which the thinker is necessarily aware, Descartes’ commitment to thought as an essential, and therefore, inseparable, property of the mind raises some awkward difficulties. Arnauld, for example, raises one such difficulty in his Objections to Descartes’ Meditations: presumably there is much going on in the mind of an infant in its mother’s womb of which the infant is not aware. In response to this objection, and also in response to another obvious problem, that is, that of dreamless sleep, Descartes insists on a distinction between being aware of or conscious of our thoughts at the time we are thinking them, and remembering them afterwards (CSMK III, 357). The infant is, in fact, aware of its thinking in the mother’s womb, but it is aware only of very confused sensory thoughts of pain and pleasure and heat (not, as Descartes points out, metaphysical matters (CSMK III, 189)) which it does not remember afterwards. Similarly, the mind is always thinking even in the most “dreamless sleep”; it is just that the mind often immediately forgets much of what it had been aware of.

Descartes’ commitment to embracing the implications – however counter-intuitive – of his substance-attribute metaphysics, puts him at odds with, for instance, Locke, who mocks the Cartesian doctrine of the always-thinking soul in his An Essay Concerning Human Understanding. For Locke, the question whether the soul is always thinking or not must be decided by experience and not, as Locke says, merely by “hypothesis” (An Essay Concerning Human Understanding, Book II, Chapter 1). The evidence of dreamless sleep makes it obvious, for Locke, that the soul is not always thinking. Because Locke ties personal identity to memory, if the soul were to think while asleep without knowing it, the sleeping man and the waking man would be two different persons.

Descartes’ commitment to the always-thinking mind is a consequence of his commitment to a more basic rational principle. In establishing his conception of thinking substance, Descartes reasons from the attribute of thinking to the substance of thinking on the grounds that nothing has no properties. In this case, he reasons in the other direction, from the substance of thinking, that is, the mind, to the property of thinking on the converse grounds that something must have properties, and the properties it must have are the properties that make it what it is; in the case of the mind, that property is thought. (Leibniz found a way to maintain the integrity of the rational principle without contradicting experience: admit that thinking need not be conscious. This way the mind can still think in a dreamless sleep, and so avoid being without any properties, without any problem about the recollection of awareness.)

Another consequence of Descartes’ substance metaphysics concerns corporeal substance. For Descartes, we do not know corporeal substance directly, but rather through a grasp of its principal attribute, extension. Extension qua property requires a substance in which to inhere because of the rational principle, nothing has no properties. This rational principle leads to another characteristic Cartesian position regarding the material world: the denial of a vacuum. Descartes denies that space can be empty or void. Space has the property of being extended in length, breadth, and depth, and such properties require a substance in which to inhere. Thus, nothing, that is, a void or vacuum, is not able to have such properties because of the rational principle, nothing has no properties. This means that all space is filled with substance, even if it is imperceptible. Once again, Descartes answers a debated philosophical question on the basis of a rational principle.

ii. Spinoza

If Descartes is known for his dualism, Spinoza, of course, is known for monism – the doctrine that there is only one substance. Spinoza’s argument for substance monism (laid out in the first fifteen propositions of the Ethics) has no essential basis in sensory experience; it proceeds through rational argumentation and the deployment of rational principles; although Spinoza provides one a posteriori argument for God’s existence, he makes clear that he presents it only because it is easier to grasp than the a priori arguments, and not because it is in any way necessary.

The crucial step in the argument for substance monism comes in Ethics 1p5: “In Nature there cannot be two or more substances of the same nature or attribute.” It is at this proposition that Descartes (and Leibniz, and many others) would part ways with Spinoza. The most striking and controversial implication of this proposition, at least from a Cartesian perspective, is that human minds cannot qualify as substances, since human minds all share the same nature or attribute, that is, thought. In Spinoza’s philosophy, human minds are actually themselves properties – Spinoza calls them “modes” – of a more basic, infinite substance.

The argument for 1p5 works as follows. If there were two or more distinct substances, there would have to be some way to distinguish between them. There are two possible distinctions to be made: either by a difference in their affections or by a difference in their attributes. For Spinoza, a substance is something which exists in itself and can be conceived through itself; an attribute is “what the intellect perceives of a substance, as constituting its essence” (Ethics 1d4). Spinoza’s conception of attributes is a matter of longstanding scholarly debate, but for present purposes, we can think of it along Cartesian lines. For Descartes, substance is always grasped through a principal property, which is the nature or essence of the substance. Spinoza agrees that an attribute is that through which the mind conceives the nature or essence of substance. With this in mind, if a distinction between two substances were to be made on the basis of a difference in attributes, then there would not be two substances of the same attribute as the proposition indicates. This means that if there were two substances of the same attribute, it would be necessary to distinguish between them on the basis of a difference in modes or affections.

Spinoza conceives of an affection or mode as something which exists in another and needs to be conceived through another. Given this conception of affections, it is impossible, for Spinoza, to distinguish between two substances on the basis of a difference in affections. Doing so would be somewhat akin to affirming that there are two apples on the basis of a difference between two colors, when one apple can quite possibly have a red part and a green part. As color differences do not per se determine differences between apples, in a similar way, modal differences cannot determine a difference between substances – you could just be dealing with one substance bearing multiple different affections. It is notable that in 1p5, Spinoza uses virtually the same substance-attribute schema as Descartes to deny a fundamental feature of Descartes’ system.

Having established 1p5, the next major step in Spinoza’s argument for substance monism is to establish the necessary existence and infinity of substance. For Spinoza, if things have nothing in common with each other, one cannot be the cause of the other. This thesis depends upon assumptions that lie at the heart of Spinoza’s rationalism. Something that has nothing in common with another thing cannot be the cause of the other thing because things that have nothing in common with one another cannot be understood through one another (Ethics 1a5). But, for Spinoza, effects should be able to be understood through causes. Indeed, what it is to understand something, for Spinoza, is to understand its cause. The order of knowledge, provided that the knowledge is genuine, or, as Spinoza says, “adequate,” must map onto the order of being, and vice versa. Thus, Spinoza’s claim that if things have nothing in common with one another, one cannot be the cause of the other, is an expression of Spinoza’s fundamental, rationalist commitment to the intelligibility of the world. Given this assumption, and given the fact that no two substances have anything in common with one another, since no two substances share the same nature or attribute, it follows that if a substance is to exist, it must exist as causa sui (self-caused); in other words, it must pertain to the essence of substance to exist. Moreover, Spinoza thinks that since there is nothing that has anything in common with a given substance, there is therefore nothing to limit the nature of a given substance, and so every substance will necessarily be infinite. This assertion depends on another deep-seated assumption of Spinoza’s philosophy: nothing limits itself, but everything by virtue of its very nature affirms its own nature and existence as much as possible.

At this stage, Spinoza has argued that substances of a single attribute exist necessarily and are necessarily infinite. The last major stage of the argument for substance monism is the transition from multiple substances of a single attribute to only one substance of infinite attributes. Scholars have expressed varying degrees of satisfaction with the lucidity of this transition. It seems to work as follows. It is possible to attribute many attributes to one substance. The more reality or being each thing has, the more attributes belong to it. Therefore, an absolutely infinite being is a being that consists of infinite attributes. Spinoza calls an absolutely infinite being or substance consisting of infinite attributes “God.” Spinoza gives four distinct arguments for God’s existence in Ethics 1p11. The first is commonly interpreted as Spinoza’s version of an ontological argument. It refers back to 1p7 where Spinoza proved that it pertains to the essence of substance to exist. The second argument is relevant to present purposes, since it turns on Spinoza’s version of the principle of sufficient reason: “For each thing there must be assigned a cause, or reason, both for its existence and for its nonexistence” (Ethics 1p11dem). But there can be no reason for God’s nonexistence for the same reasons that all substances are necessarily infinite: there is nothing outside of God that is able to limit Him, and nothing limits itself. Once again, Spinoza’s argument rests upon his assumption that things by nature affirm their own existence. The third argument is a posteriori, and the fourth pivots like the second on the assumption that things by nature affirm their own existence.

Once Spinoza has proven that a being consisting of infinite attributes exists, his argument for substance monism is nearly complete. It remains only to point out that no substance besides God can exist, because if it did, it would have to share at least one of God’s infinite attributes, which, by 1p5, is impossible. Everything that exists, then, is either an attribute or an affection of God.

iii. Leibniz

Leibniz’s universe consists of an infinity of monads or simple substances, and God. For Leibniz, the universe must be composed of monads or simple substances. His justification for this claim is relatively straightforward. There must be simples, because there are compounds, and compounds are just collections of simples. To be simple, for Leibniz, means to be without parts, and thus to be indivisible. For Leibniz, the simples or monads are the “true atoms of nature” (L, 643). However, “material atoms are contrary to reason” (L, 456). Manifold a priori considerations lead Leibniz to reject material atoms. In the first place, the notion of a material atom is contradictory in Leibniz’s view. Matter is extended, and that which is extended is divisible into parts. The very notion of an atom, however, is the notion of something indivisible, lacking parts.

From a different perspective, Leibniz’s dynamical investigations provide another argument against material atoms. Absolute rigidity is included in the notion of a material atom, since any elasticity in the atom could only be accounted for on the basis of parts within the atom shifting their position with respect to each other, which is contrary to the notion of a partless atom. According to Leibniz’s analysis of impact, however, absolute rigidity is shown not to make sense. Consider the rebound of one atom as a result of its collision with another. If the atoms were absolutely rigid, the change in motion resulting from the collision would have to happen instantaneously, or, as Leibniz says, “through a leap or in a moment” (L, 446). The atom would change from initial motion to rest to rebounded motion without passing through any intermediary degrees of motion. Since the body must pass through all the intermediary degrees of motion in transitioning from one state of motion to another, it must not be absolutely rigid, but rather elastic; the analysis of the parts of the body must, in correlation with the degree of motion, proceed to infinity. Leibniz’s dynamical argument against material atoms turns on what he calls the law of continuity, an a priori principle according to which “no change occurs through a leap.”

The true unities, or true atoms of nature, therefore, cannot be material; they must be spiritual or metaphysical substances akin to souls. Since Leibniz’s spiritual substances, or monads, are absolutely simple, without parts, they admit neither of dissolution nor composition. Moreover, there can be no interaction between monads: monads cannot receive impressions or undergo alterations by means of being affected from the outside, since, in Leibniz’s famous phrase from the Monadology, monads “have no windows” (L, 643). Monads must, however, have qualities, otherwise there would be no way to explain the changes we see in things and the diversity of nature. Indeed, following from Leibniz’s principle of the identity of indiscernibles, no two monads can be exactly alike, since each monad stands in a unique relation to the rest, and, for Leibniz, each monad’s relation to the rest is a distinctive feature of its nature. For Leibniz, the only way monads can have qualities while remaining simple, or in other words, the only way there can be multitude in simplicity, is if monads are characterized and distinguished by means of their perceptions. Leibniz’s universe, in summary, consists in monads, simple spiritual substances, characterized and distinguished from one another by a unique series of perceptions determined by each monad’s unique relationship vis-à-vis the others.

Of the great rationalists, Leibniz is the most explicit about the principles of reasoning that govern his thought. Leibniz singles out two, in particular, as the most fundamental rational principles of his philosophy: the principle of contradiction and the principle of sufficient reason. According to the principle of contradiction, whatever involves a contradiction is false. According to the principle of sufficient reason, there is no fact or true proposition “without there being a sufficient reason for its being so and not otherwise” (L, 646). Corresponding to these two principles of reasoning are two kinds of truths: truths of reasoning and truths of fact. For Leibniz, truths of reasoning are necessary, and their opposite is impossible. Truths of fact, by contrast, are contingent, and their opposite is possible. Most commentators associate truths of reasoning with the principle of contradiction because they can be reduced via analysis to a relation between two primitive ideas, whose identity is intuitively evident. Thus, it is possible to grasp why it is impossible for truths of reasoning to be otherwise. However, this kind of resolution is only possible in the case of abstract propositions, such as the propositions of mathematics (see Section 3, c, above). Contingent truths, or truths of fact, by contrast, such as “Caesar crossed the Rubicon,” to use the example Leibniz gives in the Discourse on Metaphysics, are infinitely complicated. Although, for Leibniz, every predicate is contained in its subject, to reduce the relationship between Caesar’s “notion” and his action of crossing the Rubicon would require an infinite analysis impossible for finite minds. “Caesar crossed the Rubicon” is a contingent proposition, because there is another possible world in which Caesar did not cross the Rubicon. To understand the reason for Caesar’s crossing, then, entails understanding why this world exists rather than any other possible world. It is for this reason that contingent truths are associated with the principle of sufficient reason. Although the opposite of truths of fact is possible, there is nevertheless a sufficient reason why the fact is so and not otherwise, even though this reason cannot be known by finite minds.

Truths of fact, then, need to be explained; there must be a sufficient reason for them. However, according to Leibniz, “a sufficient reason for existence cannot be found merely in any one individual thing or even in the whole aggregate and series of things” (L, 486). That is to say, the sufficient reason for any given contingent fact cannot be found within the world of which it is a part. The sufficient reason must explain why this world exists rather than another possible world, and this reason must lie outside the world itself. For Leibniz, the ultimate reason for things must be contained in a necessary substance that creates the world, that is, God. But if the existence of God is to ground the series of contingent facts that make up the world, there must be a sufficient reason why God created this world rather than any of the other infinite possible worlds contained in his understanding. As a perfect being, God would only have chosen to bring this world into existence rather than any other because it is the best of all possible worlds. God’s choice, therefore, is governed by the principle of fitness, or what Leibniz also calls the “principle of the best” (L, 647). The best world, according to Leibniz, is the one which maximizes perfection; and the most perfect world is the one which balances the greatest possible variety with the greatest possible order. God achieves maximal perfection in the world through what Leibniz calls “the pre-established harmony.” Although the world is made up of an infinity of monads with no direct interaction with one another, God harmonizes the perceptions of each monad with the perceptions of every other monad, such that each monad represents a unique perspective on the rest of the universe according to its position vis-à-vis the others.

According to Leibniz’s philosophy, in the case of all true propositions, the predicate is contained in the subject. This is often known as the “predicate-in-notion” principle. The relationship between predicate and subject can only be reduced to an identity relation in the case of truths of reason, whereas in the case of truths of fact, the reduction requires an infinite analysis. Nevertheless, in both cases, it is possible in principle (which is to say, for an infinite intellect) to know everything that will ever happen to an individual substance, and even everything that will happen in the world of an individual substance on the basis of an examination of the individual substance’s notion, since each substance is an expression of the entire world. Leibniz’s predicate-in-notion principle therefore unifies both of his two great principles of reasoning – the principle of contradiction and the principle of sufficient reason – since the relation between predicate and subject is either such that it is impossible for it to be otherwise or such that there is a sufficient reason why it is as it is and not otherwise. Moreover, it represents a particularly robust expression of the principle of intelligibility at the very heart of Leibniz’s system. There is a reason why everything is as it is, whether that reason is subject to finite or only to infinite analysis.

5. Continental Rationalism, Experience, and Experiment

Rationalism is often criticized for placing too much confidence in the ability of reason alone to know the world. The extent to which one finds this criticism justified depends largely on one’s view of reason. For Hume, for instance, knowledge of the world of “matters of fact” is gained exclusively through experience; reason is merely a faculty for comparing ideas gained through experience; it is thus parasitic upon experience, and has no claim whatsoever to grasp anything about the world itself, let alone any special claim. For Kant, reason is a mental faculty with an inherent tendency to transgress the bounds of possible experience in an effort to grasp the metaphysical foundations of the phenomenal realm. Since knowledge of the world is limited to objects of possible experience, for Kant, reason, with its delusions of grasping reality beyond those limits, must be subject to critique.

Sometimes rationalism is charged with neglecting or undervaluing experience, and with embarrassingly having no means of accounting for the tremendous success of the experimental sciences. While the criticism of the confidence placed in reason may be defensible given a certain conception of reason (which may or may not itself be ultimately defensible), the latter charge of neglecting experience is not; more often than not it is the product of a caricature of rationalism.

Descartes and Leibniz were the leading mathematicians of their day, and stood at the forefront of science. While Spinoza distinguished himself more as a political thinker, and as an interpreter of scripture (albeit a notorious one) than as a mathematician, Spinoza too performed experiments, kept abreast of the leading science of the day, and was renowned as an expert craftsman of lenses. Far from neglecting experience, the great rationalists had, in general, a sophisticated understanding of the role of experience and, indeed, of experiment, in the acquisition and development of knowledge. The fact that the rationalists held that experience and experiment cannot serve as foundations for knowledge, but must be fitted within, and interpreted in light of, a rational epistemic framework, should not be confused with a neglect of experience and experiment.

a. Descartes

One of the stated purposes of Descartes’ Meditations, and, in particular, the hyperbolic doubts with which it commences, is to reveal to the mind of the reader the limitations of its reliance on the senses, which Descartes regards as an inadequate foundation for knowledge. By leading the mind away from the senses, which often deceive, and which yield only confused ideas, Descartes prepares the reader to discover the clear and distinct perceptions of the pure intellect, which provide a proper foundation for genuine knowledge. Nevertheless, empirical observations and experimentation clearly had an important role to play in Descartes’ natural philosophy, as evidenced by his own private empirical and experimental research, especially in optics and anatomy, and by his explicit statements in several writings on the role and importance of observation and experiment.

In Part 6 of the Discourse on the Method, Descartes makes an open plea for assistance – both financial and otherwise – in making systematic empirical observations and conducting experiments. Also in Discourse Part 6, Descartes lays out his program for developing knowledge of nature. It begins with the discovery of “certain seeds of truth” implanted naturally in our souls (CSM I, 144). From them, Descartes seeks to derive the first principles and causes of everything. Descartes’ Meditations illustrates these first stages of the program. By “seeds of truth” Descartes has in mind certain intuitions, including the ideas of thinking, and extension, and, in particular, of God. On the basis of clearly and distinctly perceiving the distinction between what belongs properly to extension (figure, position, motion) and what does not (colors, sounds, smells, and so forth), Descartes discovers the principles of physics, including the laws of motion. From these principles, it is possible to deduce many particular ways in which the details of the world might be, only a small fraction of which represent the way the world actually is. It is as a result of the distance, as it were, between physical principles and laws of nature, on one hand, and the particular details of the world, on the other, that, for Descartes, observations and experiments become necessary.

Descartes is ambivalent about the relationship between physical principles and particulars, and about the role that observation and experiment play in mediating this relationship. On the one hand, Descartes expresses commitment to the ideal of a science deduced with certainty from intuitively grasped first principles. Because of the great variety of mutually incompatible consequences that can be derived from physical principles, observation and experiment are required even in the ideal deductive science to discriminate between actual consequences and merely possible ones. According to the ideal of deductive science, however, observation and experiment should be used only to facilitate the deduction of effects from first causes, and not as a basis for an inference to possible explanations of natural phenomena, as Descartes makes clear at one point in his Principles of Philosophy (CSM I, 249). If the explanations were only possible, or hypothetical, the science could not lay claim to certainty per the deductive ideal, but merely to probability.

On the other hand, Descartes states explicitly at another point in the Principles of Philosophy that the explanations provided of such phenomena as the motion of celestial bodies and the nature of the earth’s elements should be regarded merely as hypotheses arrived at on the basis of a posteriori reasoning (CSM I, 255); while Descartes says that such hypotheses must agree with observation and facilitate predictions, they need not in fact reflect the actual causes of phenomena. Descartes appears to concede, albeit reluctantly, that when it comes to explaining particular phenomena, hypothetical explanations and moral certainty (that is, mere probability) are all that can be hoped for.

Scholars have offered a range of explanations for the inconsistency in Descartes’ writings on the question of the relation between first principles and particulars. It has been suggested that the inconsistency within the Principles of Philosophy reflects different stages of its composition (see Garber 1978). However the inconsistency might be explained, it is clear that Descartes did not take it for granted that the ideal of a deductive science of nature could be realized. Moreover, whether or not Descartes ultimately believed the ideal of deductive science was realizable, he was unambiguous on the importance of observation and experiment in bridging the distance between physical principles and particular phenomena. (For further discussion, see René Descartes: Scientific Method.)

b. Spinoza

The one work that Spinoza published under his own name in his lifetime was his geometrical reworking of Descartes’ Principles of Philosophy. In Spinoza’s presentation of the opening sections of Part 3 of Descartes’ Principles, Spinoza puts a strong emphasis on the hypothetical nature of the explanations of natural phenomena in Part 3. Given the hesitance and ambivalence with which Descartes concedes the hypothetical nature of his explanations in his Principles, Spinoza’s unequivocal insistence on hypotheses is striking. Elsewhere Spinoza endorses hypotheses more directly. In the Treatise on the Emendation of the Intellect, Spinoza describes forming the concept of a sphere by affirming the rotation of a semicircle in thought. He points out that this idea is a true idea of a sphere even if no sphere has ever been produced this way in nature (The Collected Works of Spinoza, Vol. 1, p. 32). Spinoza’s view of hypotheses relates to his conception of good definitions (see Section 3, b, above). If the cause through which one conceives something allows for the deduction of all possible effects, then the cause is an adequate one, and there is no need to fear a false hypothesis. Spinoza appears to differ from Descartes in thinking that the formation of hypotheses, if done properly, is consistent with deductive certainty, and not tantamount to mere probability or moral certainty.

Again in the Treatise on the Emendation of the Intellect, Spinoza speaks in Baconian fashion of identifying “aids” that can assist in the use of the senses and in conducting orderly experiments. Unfortunately, Spinoza’s comments regarding “aids” are very unclear. This is perhaps explained by the fact that they appear in a work that Spinoza never finished. Nevertheless, it does seem clear that although Spinoza, like Descartes, emphasized the importance of discovering proper principles from which to deduce knowledge of everything else, he was no less aware than Descartes of the need to proceed via observation and experiment in descending from such principles to particulars. At the same time, given his analysis of the inadequacy of sensory images, the collection of empirical data must be governed by rules and rational guidelines, the details of which Spinoza does not seem ever to have worked out.

A valuable perspective on Spinoza’s attitude toward experimentation is provided by Letter 6, which Spinoza wrote to Oldenburg with comments on Robert Boyle’s experimental research. Among other matters, at issue is Boyle’s “redintegration” (or reconstitution) of niter (potassium nitrate). By heating niter with a burning coal, Boyle separated the niter into a “fixed” part and a volatile part; he then proceeded to distill the volatile part, and recombine it with the fixed part, thereby redintegrating the niter. Boyle’s aim was to show that the nature of niter is not determined by a Scholastic “substantial form,” but rather by the composition of parts, whose secondary qualities (color, taste, smell, and so forth) are determined by primary qualities (size, position, motion, and so forth). While taking no issue with Boyle’s attempt to undermine the Scholastic analysis of physical natures, Spinoza criticized Boyle’s interpretation of the experiment, arguing that the fixed niter was merely an impurity left over, and that there was no difference between the niter and the volatile part other than a difference of state.

Two things stand out from Spinoza’s comments on Boyle. On the one hand, Spinoza exhibits a degree of impatience with Boyle’s experiments, charging some of them with superfluity on the grounds either that what they show is evident on the basis of reason alone, or that previous philosophers have already sufficiently demonstrated them experimentally. In addition, Spinoza’s own interpretation of Boyle’s experiment is primarily based in a rather speculative, Cartesian account of the mechanical constitution of niter (as Boyle himself points out in response to Spinoza). On the other hand, Spinoza appears eager to show his own fluency with experimental practice, describing no fewer than three different experiments of his own invention to support his interpretation of the redintegration. What Spinoza is critical of is not so much Boyle’s use of experiment per se as his relative neglect of relevant rational considerations. For instance, Spinoza at one point criticizes Boyle for trying to show that secondary qualities depend on primary qualities on experimental grounds. Spinoza thought the proposition needed to be demonstrated on rational grounds. While Spinoza acknowledges the importance and necessity of observation and experiment, his emphasis and focus are on the rational framework needed for making sense of experimental findings, without which the results are confused and misleading.

c. Leibniz

In principle, Leibniz thinks it is not impossible to discover the interior constitution of bodies a priori on the basis of a knowledge of God and the “principle of the best” according to which He creates the world. Leibniz sometimes remarks that angels could explain to us the intelligible causes through which all things come about, but he seems conflicted over whether such understanding is actually possible for human beings. Leibniz seems to think that while the a priori pathway should be pursued in this life by the brightest minds in any case, its perfection will only be possible in the afterlife. The obstacle to an a priori conception of things is the complexity of sensible effects. In this life, then, knowledge of nature cannot be purely a priori, but depends on observation and experimentation in conjunction with reason.

Apart from perception, we have clear and distinct ideas only of magnitude, figure, motion, and other such quantifiable attributes (primary qualities). The goal of all empirical research must be to resolve phenomena (including secondary qualities) into such distinctly perceived, quantifiable notions. For example, heat is explained in terms of some particular motion of air or some other fluid. Only in this way can the epistemic ideal be achieved of understanding how phenomena follow from their causes in the same way that we know how the hammer stroke after a period of time follows from the workings of a clock (L, 173). To this end, experiments must be carried out to indicate possible relationships between secondary qualities and primary qualities, and to provide a basis for the formulation of hypotheses to explain the phenomena.

Nevertheless, there is an inherent limitation to this procedure. Leibniz explains that if there were people who had no direct experience of heat, for instance, even if someone were to explain to them the precise mechanical cause of heat, they would still not be able to know the sensation of heat, because they would still not distinctly grasp the connection between bodily motion and perception (L, 285). Leibniz seems to think that human beings will never be able to bridge the explanatory gap between sensations and mechanical causes. There will always be an irreducibly confused aspect of sensible ideas, even if they can be associated with a high degree of sophistication with distinctly perceivable, quantifiable notions. However, this limitation does not mean, for Leibniz, that there is any futility in human efforts to understand the world scientifically. In the first place, experimental knowledge of the composition of things is tremendously useful in practice, even if the composition is not distinctly perceived in all its parts. As Leibniz points out, the architect who uses stones to erect a cathedral need not possess a distinct knowledge of the bits of earth interposed between the stones (L, 175). Secondly, even if our understanding of the causes of sensible effects must remain forever hypothetical, the hypotheses themselves can be more or less refined, and it is proper experimentation that assists in their refinement.

6. References and Further Reading

Citations of the works of Descartes follow the three-volume English translation by Cottingham, Stoothoff, Murdoch, and Kenny. For the original language, the edition by Adam and Tannery was consulted.

When citing Spinoza’s Ethics, the translation by Curley in A Spinoza Reader was used. The following system of abbreviation was used when citing passages from the Ethics: the first number designates the part of the Ethics (1-5); then, “p” is for proposition, “d” for definition, “a” for axiom, “dem” for demonstration, “c” for corollary, and “s” for scholium. So, 1p17s refers to the scholium of the seventeenth proposition of the first part of the Ethics. For the original language, the edition by Gebhardt was consulted.

For the original language in Leibniz, the edition by Gerhardt was consulted.

a. Primary Sources

Bacon, Francis. The Works of Francis Bacon. 7 Volumes. Edited by J. Spedding, R. L. Ellis, and D.D. Heath. London: Longmans, 1857-70. Cited above as Spedding, volume, page.
Bacon, Francis. The New Organon. Edited by Lisa Jardine and Michael Silverthorne. Cambridge, UK: Cambridge University Press, 2000.
Descartes, René. Oeuvres de Descartes. 12 Volumes. Edited by C. Adam and P. Tannery. Paris: J. Vrin, 1964-76.
Descartes, René. The Philosophical Writings of Descartes. 3 vols. Vols. 1 and 2 translated by John Cottingham, Robert Stoothoff, and Dugald Murdoch. Vol. 3 translated by John Cottingham, Robert Stoothoff, Dugald Murdoch, and Anthony Kenny. Cambridge, UK: Cambridge University Press, 1984-91. Cited above as CSM or CSMK, volume, page.
Hegel, G.W.F. Werke in zwanzig Bänden: Vorlesungen über die Geschichte der Philosophie: Werke XX. Edited by Eva Moldenhauer and Karl Markus Michel. Frankfurt am Main: Suhrkamp Verlag, 1986. Cited as Werke, volume, page.
Hobbes, Thomas. The English Works of Thomas Hobbes of Malmesbury, Volume 1. London: John Bohn, 1839.
Leibniz, G.W. Philosophische Schriften. 7 Volumes. Edited by C.I. Gerhardt. Berlin, 1875-90.
Leibniz, G.W. Philosophical Papers and Letters. Second Edition. Translated and edited by Leroy E. Loemker. Dordrecht: Kluwer, 1989. Cited above as L, page.
Leibniz, G.W. New Essays on Human Understanding. Translated and edited by Peter Remnant and Jonathan Bennett. Cambridge, UK: Cambridge University Press, 1996. Cited above as NE, page.
Locke, John. An Essay Concerning Human Understanding. Edited by Peter H. Nidditch. Oxford, UK: Oxford University Press, 1979.
Malebranche, Nicolas. The Search after Truth. Translated and edited by Thomas M. Lennon and Paul J. Olscamp. Cambridge, UK: Cambridge University Press, 1997.
Spinoza, Benedict de. Spinoza Opera. 4 Volumes. Edited by C. Gebhardt. Heidelberg: Carl Winter, 1925.
Spinoza, Benedict de. The Collected Works of Spinoza. Vol. 1. Edited and translated by Edwin Curley. Princeton, NJ: Princeton University Press, 1985.
Spinoza, Benedict de. A Spinoza Reader: The Ethics and Other Works. Edited and translated by Edwin Curley. Princeton, NJ: Princeton University Press, 1994.
Spinoza, Benedict de. Spinoza: Complete Works. Translated by Samuel Shirley and edited by Michael L. Morgan. Indianapolis, IN: Hackett, 2002.

b. Secondary Sources

Ayers, Michael (ed.). Rationalism, Platonism and God. Oxford, UK: Oxford University Press, 2007.
Biasutti, Franco. “Reason and Experience in Leibniz and Spinoza” in Studia Spinozana, Volume 6, Spinoza and Leibniz (1990): 45-71.
Cottingham, John. The Rationalists. Oxford, UK: Oxford University Press, 1988.
Della Rocca, Michael. Spinoza. London: Routledge, 2008.
Fraenkel, Carlos; Perinetti, Dario; Smith, Justin E.H. (eds.). The Rationalists: Between Tradition and Innovation. Dordrecht: Springer, 2011.
Gabbey, Alan. “Spinoza’s natural science and methodology” in The Cambridge Companion to Spinoza, edited by Don Garrett. Cambridge, UK: Cambridge University Press, 1996.
Garber, Daniel. “Science and Certainty in Descartes” in Descartes: Critical and Interpretive Essays, edited by Michael Hooker. Baltimore, MD: The Johns Hopkins University Press, 1978.
Garber, Daniel. Descartes Embodied: Reading Cartesian Philosophy through Cartesian Science. Cambridge, UK: Cambridge University Press, 2001.
Huenemann, Charlie. Understanding Rationalism. Durham, UK: Acumen Publishing, 2008.
Leduc, Christian. “Leibniz and Sensible Qualities” in British Journal for the History of Philosophy. 18(5), 2010: 797-819.
Nelson, Alan (ed.). A Companion to Rationalism. Oxford, UK: Blackwell, 2005.
Pereboom, Derk (ed.). The Rationalists: Critical Essays on Descartes, Spinoza, and Leibniz. Rowman & Littlefield, 1999.
Phemister, Pauline. The Rationalists: Descartes, Spinoza, and Leibniz. Polity, 2006.
Wilson, Margaret Dauler. Ideas and Mechanism: Essays on Early Modern Philosophy. Princeton, NJ: Princeton University Press, 1999.

Author Information

Matthew Homan
Email: matthew.homan@cnu.edu
Christopher Newport University
U. S. A.

Knowledge Norms

Epistemology has seen a surge of interest in the idea that knowledge provides a normative constraint or rule governing certain actions or mental states. Such interest is generated in part by noticing that fundamentally epistemic notions, such as belief, evidence, and justification, figure prominently not only in theorizing about knowledge, but also in our everyday evaluations of each other’s actions, reasoning, and doxastic commitments. The three most prominent proposals to emerge from the epistemology literature have been that knowledge is the norm of assertion, the norm of action, and the norm of belief, though we shall consider other proposals as well.

‘Norm’ here is often, but not always, understood as a rule which is intimately related to the action/mental state type in question, such that this relationship is a constitutive one: the action or mental state is constituted (in part) by its relationship to the rule. Typically such views argue for a norm of permission such that knowledge is required, as a necessary condition, for permissibly acting or being in the relevant mental state: in schematic form, one must: X only if one knows a relevantly specified proposition. Some philosophers also endorse a sufficiency condition, so that knowledge is necessary and sufficient for (epistemic) permission to X: one must: X if and only if one knows a relevantly specified proposition. Such views put knowledge to work in elucidating normative concepts, practical rationality, and conceptual priorities in epistemology, mind, and decision theory. This article outlines the growing literature on these topics.

Knowledge Norm of Assertion
Knowledge Norm of Action
Knowledge Norm of Belief
1. The Belief-Assertion Parallel
2. Knowledge Disagreement Norm
References and Further Reading

1. Knowledge Norm of Assertion

Assertion is the speech act we use to make claims about the way things are: in English, asserting is the default speech act for uttering a sentence in the indicative or declarative mood, such as when one tells someone, “John is in his office” (for an overview of assertion, including ways of characterizing it that do not make essential appeal to epistemic norms, see MacFarlane 2011). The recent literature on the norms of assertion has concentrated on whether there is a rule governing the speech act of assertion which specifies a necessary condition for making the speech act permissible on that occasion; section 1.d below briefly discusses the idea of a sufficient condition for permissible assertion. The view has its roots in the work of philosophers who argued that when one asserts, claims, or declares that p (which are to be distinguished from simply uttering “p”) one somehow thereby represents oneself as knowing that p, even though p itself may not refer to the speaker’s knowledge at all (see Moore 1962: 277; Moore 1993: 211; Black 1952; and Unger 1975: 251ff.). The idea that when one asserts that p one represents oneself as knowing that p (call this position ‘RK’) enabled an explanation of certain problem sentences and conversational patterns.

a. Problem Sentences: Moore’s Paradox

G.E. Moore noted the paradoxical nature of asserted conjunctions where one affirms a proposition but also denies that one believes it or that one knows it. Conjunctions such as (1) and (2), he said, sound “absurd” (Moore 1942: 542-43; 1962: 277):

(1) Dogs bark, but I don’t believe that they do.

(2) Dogs bark, but I don’t know that they do.

The order of the conjuncts does not matter to their absurdity, as (3) and (4) make clear (Moore 1993: 207):

(3) I don’t believe that dogs bark, but they do.

(4) I don’t know that [whether] dogs bark, but they do.

What captured Moore’s interest about such asserted sentences is that they could be true and yet it seems incoherent to state that truth: “It is a paradox that it should be perfectly absurd to utter assertively words of which the meaning is something which may quite well be true—is not a contradiction” (Moore 1993: 209). Moore’s own diagnosis of their absurdity appeals to something like RK, namely that “by asserting p positively, you imply, though you don’t assert, that you know that p” (1962: 277). So in asserting one of (1) – (4), one asserts, in one conjunct, a proposition and thereby also represents oneself as knowing it; but one also denies, in the other conjunct, that one knows it (or believes it, entailed by knowing it), thus generating a contradiction between what one claims (that one doesn’t know) and what one represents as being the case (that one does know).

b. Conversational Patterns

Peter Unger (1975) pointed to certain conversational patterns which seem to support RK, because RK well-explains them. One of these concerns the common use of the question “How do you know?” in response to someone’s assertion: such a question may be used to elicit clarification about why one is flat-out asserting, but importantly, it also may be used to challenge someone’s assertion. What is more, it is rare that this question is condemned as out of line in response to an assertion. Such questions are appropriate even though an asserter has said nothing at all about knowing what she’s asserted, and an asserter cannot acceptably answer such questions by claiming that she never claimed that she knew it. And an asserter who concedes with “I don’t know,” or modifies her original assertion by moving to “I believe p” or “I think p” or “Probably p” will normally be taken to be retreating from her original outright assertion that p: she has instead replaced her claim with a weaker one. RK explains all these points (Unger 1975: 263-64; cf. also Slote 1979).

Timothy Williamson (1996; 2000, Ch. 11) provides a fuller defense of the view and points to further conversational patterns explained by RK. Williamson’s account replaces RK with the Knowledge Norm of Assertion, sometimes called the ‘Knowledge Account of Assertion’, which says that

(KNA) One must: assert that p only if one knows that p

KNA can be thought of “as giving the condition on which a speaker has the authority to make an assertion. Thus asserting p without knowing p is doing something without having the authority to do it, like giving someone a command without having the authority to do so” (2000: 257). Williamson thinks of KNA as constitutive of the speech act of assertion, conceived of by analogy with the rules of a game: just as the rules of chess are essential to it in that they constitute what the game is and what it is to play chess, so Williamson thinks of the speech act of assertion as constituted by its relation to KNA. In this sense, mastering the speech act of assertion involves implicitly grasping this norm and the practice which it governs (2000: 241); indeed, the speech act plausibly functions to express one’s knowledge (Turri 2011). If this is correct, KNA would explain RK, a descriptive fact about what speakers who assert represent about themselves: for it is in virtue of engaging in a practice whose norm we all implicitly grasp that one would represent oneself as conforming to that norm. (For helpful discussion of Williamson’s approach to constitutivity, see Turri 2014a; for an account on which the KNA is derived from a more fundamental norm of intellectual flourishing, see Brogaard 2014.)

Williamson notes that in addition to the “How do you know?” question which can be used to implicitly challenge one’s authority to assert, the stronger challenge question “Do you know that?” explicitly challenges one’s authority, and the dismissal “You don’t know that!” rejects one’s authority. KNA explains this range of aggressiveness (Williamson 2000: 253; 2009: 344). Turri (2010) further points out that there is an asymmetry between the acceptability of certain kinds of prompts to assertion:

(5) Do you know whether p?

(6) Is p?

are typically interchangeable as prompts to an assertion, and the flat-out assertion “p” serves to answer each equally well; but certain stronger questions, such as “Are you certain that p?”, typically cannot be used, as can (5) and (6), as an initial prompt for assertion, whereas weaker prompts such as “Do you think that p?” or “Do you have any idea whether p?” seem to request something weaker than a flat-out assertion (perhaps a hedged assertion or a prediction), and are thereby not interchangeable with (5) and (6). Related to these data is the fact that a standard response when one feels not well-positioned to assert, in reply to a prompt like (6), is to answer “I don’t know.” The appropriateness of the “I don’t know” response is telling given that the query was about p, not about whether one knows that p. Thus KNA seems confirmed by these data.

In addition to prompts and challenges, and our responses to them, there is data from lottery assertions (Williamson 2000: 246-252, Hawthorne 2004: 21-23). Many people find it somehow inappropriate for people to flat-out assert of a particular lottery ticket (before the draw has been announced) that it will lose, even though given a large enough lottery its losing is overwhelmingly probable. Many also find it plausible that one does not know that such a ticket will lose. KNA proponents aim to explain the first point in terms of the second: the reason it is inappropriate for one to make such lottery assertions, absent special knowledge about the lottery being rigged, is that one does not know that the ticket will lose.

Benton (2011) and Blaauw (2012) also point to peculiar facts about the parenthetical positioning of “I know” in assertive sentences, which seem well-explained by KNA. Notice that “I believe” (or “I think,” or “probably”) can occur in assertive constructions to hedge one’s assertion, and syntactically they can occur in prefaced or utterance-initial position (7), parenthetical position (8), or utterance-final parenthetical position (9), with each sounding as good as the other:

(7) I believe that John is in his office.

(8) John is, I believe, in his office.

(9) John is in his office, I believe.

Yet with “I know,” (10) sounds perfectly in order, but (11) and (12), while coherent, can seem oddly redundant:

(10) I know that John is in his office.

(11) ? John is, I know, in his office.

(12) ? John is in his office, I know.

KNA is able to explain why: if flat-out assertions express one’s knowledge, or represent one as knowing, it will be expressively redundant to add to it that one knows (where (10) is not redundant because it seems to be the amplified claim that: one knows that John’s in his office). However this explanatory argument from KNA of such data has been critiqued as incomplete or inadequate (see McKinnon & Turri 2013, McGlynn 2014).

Finally, knowledge seems to be connected to assertion in parallel with its connection to showing someone how to do something: in the same way that knowing that p seems to be required for permissibly asserting that p, knowing how to X seems to be required for permissibly showing someone how to X. In this sense, knowing is the pedagogical norm of showing, for considerations structurally parallel to the linguistic data discussed above (Moorean conjunctions, challenges, prompts, etc.) are available for pedagogical contexts (Buckwalter & Turri 2014).

In short, KNA claims to offer the best explanation of these data from Moorean conjunctions, challenges, prompts, responses to prompts, lottery assertions, parenthetical positioning, and pedagogical norms.

c. Rivals and Objections

Though KNA has been widely defended, its opponents offer substantial criticism and suggest rival accounts requiring other epistemic or alethic conditions: most rival norms of assertion appeal to justified or reasonable or well-supported belief, or that it be reasonable or credible for one to believe, or that one’s assertion be true.

Williamson (2000: 244-249) considered a Truth Norm to be the most significant rival to KNA. Because knowledge is factive, the KNA requires its assertions to be true; but according to the Truth Norm, one must: assert that p only if p is true (a further norm requiring evidence for p would be derivable from the requirement of truth). The Truth Norm is thus less demanding than the KNA. Weiner (2005) argues for a Truth Norm by noting that cases of prediction and retrodiction seem to be counterexamples to KNA, that is, they are assertions which seem intuitively acceptable even though the propositions affirmed are not known. Weiner further argues that the Truth Norm can rely on Gricean pragmatic resources to explain the data from lotteries and Moorean conjunctions, for the Truth Norm on its own does not predict the inappropriateness of such assertions. While Weiner (2005) and Whiting (2013) argue for truth as necessary and sufficient for the epistemic propriety of assertion, Littlejohn (2012) and Turri (2013b) argue (compatibly with the KNA) that truth is necessary for epistemically proper assertion; Littlejohn’s defense of factivity focuses on the requirement that assertions about what a subject ought to do would have to satisfy the truth requirement to be properly asserted, whereas Turri’s draws on experimental investigation of people’s judgments of false assertions. For criticisms of Weiner’s Truth Norm, see Pelling (2011) and Benton (2012). A related norm is that proposed by Maitra and Weatherson (2010): they argue that a certain class of statements, namely those concerned with what is “the thing for one to do,” form an important exception to the KNA. Their rival norm, the Action Rule, says “Assert that p only if acting as if p is true is the thing for you to do” (2010: 114). They argue that their Action Rule collapses into the Truth Norm for propositions concerning what one should do (“if an agent should do X, then that agent is in a position to say that they should do X,” 2010: 100), though it does not do so for other propositions.

Douven (2006) argues for a Rational Credibility Norm, and Lackey (2007) argues for a Reasonable-to-Believe Norm; for related norms, see also McKinnon’s (2013) Supportive Reasons Norm. These views roughly hold that to be epistemically acceptable, an assertion that p need not be known, but must be credible or reasonable for the speaker to believe, even if it is not actually believed by the speaker. Douven’s approach argues that his norm is as adequate as the KNA in explaining most of the linguistic data canvassed above, but that his Rational Credibility norm is a priori simpler than, and so preferable to, the KNA (cf. Douven 2009 which updates some of his arguments). Lackey’s influential discussion argues for this view by suggesting that cases of selfless assertion are intuitively acceptable. Selfless assertions involve cases in which an asserter possesses knowledge-worthy evidence, appreciates the strength of that evidence, yet for non-epistemic reasons fails to believe that p (and asserts that p anyway). Thus on Lackey’s particular account, the speaker need not even believe what is asserted (for criticism of Lackey’s view, see Turri 2014b). Because these norms sanction lottery assertions and Moorean assertions, Douven and Lackey both attempt to explain away the impropriety attending to such assertions.

Kvanvig (2009, 2011a) argues for a Justified Belief Norm; somewhat related is Neta’s (2009) Justified-Belief-that-One-Knows Norm. These norms require, for permissible assertion, a justified belief of some kind: either that the asserter justifiably believe what is asserted, perhaps even with knowledge-level justification; or that the asserter hold the higher-order justified belief that she knows what she’s asserted (the latter of which will, on many views, itself require that she justifiably believe the asserted proposition). These norms do not actually require an assertion to be true, and thus their proponents have to explain the apparent defect in a false assertion, even if one is largely absolved from blame given that one was justified in believing what was asserted (for discussion see Williamson 2009: 345). Similarly, Coffman (2014) argues for a Would-Be Knowledge Norm, which is stronger than a justified belief norm in that it requires not only knowledge-level justification, but also that the belief not be Gettiered. This norm also, however, does not require truth, for one might have a false belief which (given one’s knowledge-level justification) would be knowledge if only it were true.

Another rival approach is a context-sensitive norm of assertion which accepts that an epistemic norm governs assertion, but claims that its content can vary according to context. There are different ways of formulating such an account. On Gerken’s (2012) view, the epistemic norm of a central type of assertion is an internalist norm of “Discursive Justification,” according to which an asserter must be able to articulate reasons for her belief in the proposition asserted. This approach is context-sensitive in that what counts as adequate reason-giving will vary according to context (for other norms of assertion that impose primarily ‘down stream’ requirements on the speaker, see also Rescorla 2009 and MacFarlane 2009: 90ff.).

Goldberg (2009, 2011) initially applied the KNA to issues in the epistemology of testimony. More recently, Goldberg (2015) formulates and defends a context-sensitive norm on which knowledge is often required for permissible assertion—perhaps knowledge is even the default value—but in other contexts justification or reasonable belief might be enough, and in still other contexts, perhaps something even stronger than knowledge is required (certainty, perhaps). Goldberg draws on Grice’s (1989) maxim of quality, according to which assertions are governed by the first supermaxim and its two submaxims:

Quality: Try to make your contribution one that is true.

Do not say that which you believe to be false.
Do not say that for which you lack adequate evidence. (1989, 27)

Grice’s quality maxim, invoking as it does the notion of ‘adequate’ evidence, would seem to be just such a context-sensitive norm (though see Benton 2014a, for reasons to doubt this). Goldberg’s hypothesis is that there is a Mutually-Manifest Epistemic Norm of Assertion (MMENA), which comprises a norm (ENA) and the context-sensitive mechanism (RMBS) which fixes the epistemic condition required by ENA:

ENA S must: assert p, only if S satisfies epistemic condition E with respect to p, i.e., only if S has the relevant warranting authority regarding p.

RMBS When it comes to a particular assertion that p, the relevant warranting authority regarding p depends in part on what it would be reasonable for all parties to believe is mutually believed among them (regarding such things as the participants’ interests and informational needs, and the prospects for high-quality information in the domain in question) (Goldberg, 2015, Chap. 12)

McKinnon’s (2013) Supportive Reasons Norm is designed to be similarly context-sensitive, and on a natural reading, Lackey’s Reasonable-to-Believe Norm can be understood this way as well; Stone (2007: 100-101) also prefers, but does not develop, a kind of context-sensitive norm opposed to the KNA. Such rival norms have the intuitive benefit of explaining a great range of conversational contexts in which we seem to assert acceptably; however, with this flexibility comes the burden of having to provide plausible explanations of the data, considered in sections 1.a and 1.b above, which invoke knowledge.

Note however that opting for a context-sensitive norm need not mean that one eschews the KNA. DeRose (2002; 2009 Chap. 3) accepts a version of KNA, but regards “know(s)” as semantically context-sensitive. Thus the standard for the truth of “knowledge” ascriptions at a context sets the standard for permissible assertion: for a given speaker S in a conversational context C, the truth conditions for “S knows that p” at C are the assertibility conditions for S to assert that p in C. On this view, knowledge remains the norm of assertion. Relatedly, Schaffer (2008) argues for a contextualist version of KNA which he claims supports contrastivism about knowledge.

Many of the rival norms to KNA are motivated in part by the idea that KNA is just too strong an epistemic requirement on assertion: many KNA opponents find it implausible to think that one has done anything wrong by asserting what one doesn’t know, so long as one’s assertion, or one’s decision to assert p, is supported in the relevant way by adequate evidence or reasons for p (see McGlynn 2014 for a thorough discussion). Some of these objections to KNA come from appeals to intuitions about cases, in particular, cases in which one asserts with strong grounds or evidence, but one is in a Gettier situation, or what one asserts is unluckily false. In general, such cases appeal to what are judged to be blameless assertions (for concerns about relying on such judgments of blame, see Turri & Blouw 2014). Some proponents of KNA respond that in such cases one asserts reasonably if one reasonably took oneself to know, even though on KNA, one still asserts impermissibly: its being reasonable is what excuses one for having violated the norm, and the plausibility of calling it an ‘excuse’ suggests that a norm was violated (Williamson 2000: 256; DeRose 2009: 93-95; Sutton 2007: 80; Hawthorne & Stanley 2008: 573, 586). This excuse maneuver, however, has been heavily criticized for multiplying senses of propriety or for being too general (Lackey 2007, Gerken 2011, Kvanvig 2011a). See also Littlejohn 2012 and 2014 for extensive discussion of the notion of excuse, as related to these norms.

Other opponents of the KNA are particularly motivated to preserve the acceptability of our assertive practices within special contexts which are nevertheless familiar and ones in which it seems that we do assert, such as the philosophy seminar room (see Goldberg, 2015). Still others rely on intuitions about cases and a desire to give a normative role to the hearer of an assertion (see Pelling’s 2013b “knowledge provision” account). Some express skepticism at the very idea of there being a constitutive epistemic norm of assertion in Williamson’s sense, preferring instead the idea that more general norms of cooperation and rationality (perhaps those given by Grice) will suffice to explain any normativity in our practice of saying and asserting (e.g. Cappelen 2011; see Benton, 2014a, and Montgomery 2014 for discussion). Maitra (2011) in particular presents a challenge to Williamson’s way of formulating the notion of constitutive rules on analogy with the rules of a game. Yet the general idea that a constitutive epistemic norm can individuate speech acts has been deployed for other speech acts on the assertive spectrum: Turri (2013) thereby individuates the stronger speech act of guaranteeing, and Benton & Turri (2014) individuate the speech act of prediction.

The final rival to the KNA considered here is a Certainty Norm (Stanley 2008), on which to assert that p one must be (subjectively) certain that p. This norm is motivated in part by the idea that the Moorean conjunction schemas

(13) p but I’m not certain that p

(14) p but it is not certain that p

strike many as just as problematic as the knowledge and belief conjunctions (1)-(4) considered above; a Certainty Norm could explain them, and if certainty is required for knowledge, it could also explain (1)-(4). However, the Certainty Norm inherits the ‘too strong’ objection with which many charge KNA, and as noted above, certainty, unlike knowledge, does not figure in both prompts and challenges to assertions (Turri 2010). Also, it is unclear how the Certainty Norm will handle the truth desideratum insofar as conversational participants generally seem to care about truth, and not just a speaker’s confidence, in assertion.

d. Sufficiency

Even if KNA can seem to impose an overly demanding condition on the propriety of assertion, on first pass it might seem that knowledge at least provides a sufficient condition on epistemically permissible assertion. After all, this idea goes, even if some epistemic/alethic standard weaker than knowledge is necessary for permissible assertion, nevertheless surely having knowledge is sufficient. Most of the rivals to KNA ought to agree that when one knows, one thereby arguably meets the less stringent standards of: truth, it being reasonable/credible to believe, being justified in believing, and (if the contextually set standards for certainty do not easily come apart from those of knowledge) being certain enough to assert. Thus some of KNA’s defenders (cf. Hawthorne 2004: 23 n. 58, and 87; DeRose 2009: 93), and many of its opponents, could be tempted to endorse a sufficiency direction of the knowledge norm, such as the following:

(KNA-S) One is properly epistemically positioned to assert that p if one knows that p.

(As shall be seen below in section 2.c, similar sufficiency principles, tying knowledge to action, undergird pragmatic encroachment views of knowledge.)

But Lackey (2011, 2013) has argued that in fact, KNA-S is false (compare Pelling 2013a for another argument). She appeals to cases of what she calls isolated second-hand knowledge to show that in some settings, particularly those involving experts, asserting even though one knows is epistemically deficient. Consider a case in which an oncologist has referred her patient for lab tests, which arrive back on her day off. She must meet with the patient to provide the diagnosis, if any, and is only able to confer briefly with the oncologist from the lab about what the diagnosis is (that he has pancreatic cancer). The doctor can learn from her colleague’s testimony that her patient has pancreatic cancer, but this knowledge is isolated (she knows no other facts about the test results or the diagnosis), and entirely second-hand (via testimony with the lab oncologist). Given her epistemic situation, Lackey argues, it is intuitively (epistemically) impermissible for the doctor to assert to her patient that he has pancreatic cancer, even though she knows this. More generally, for experts asserting as experts, it seems that asserting with merely isolated second-hand knowledge is (epistemically) improper, because experts ought to engage their expertise first-hand, or ought to have more than isolated knowledge gained entirely through expert testimony. Thus Lackey argues that KNA-S is false. (See Carter & Gordon 2011 for an appeal to the idea that understanding is needed. For a challenge to Lackey’s cases, see Benton 2014b; for her reply, see Lackey 2014.)

2. Knowledge Norm of Action

Knowledge seems intimately connected to our reasons for, and our evaluations of, action. Recently many philosophers have endorsed normative connections between knowledge and action, and have deployed principles according to which knowledge is either necessary, sufficient, or both necessary and sufficient for appropriate action. Some of these discussions are focused on action as the result of practical reasoning, or on the connection between knowledge and reasons, or on knowledge as a sufficient epistemic position for acting on a proposition. We will consider these in turn.

a. Knowledge and Practical Reasoning

Some philosophers have noticed intuitive connections between knowledge, assertion, and practical reasoning (see Fantl & McGrath 2002; Hawthorne 2004, esp. 21-32, and Ch. 4; Stanley 2005; and Hawthorne & Stanley 2008). Many thus argue that knowledge plays an important normative role in practical reasoning: when one faces a decision over whether to act that depends on the truth of some proposition, then acting without knowing that proposition can seem epistemically suspect and deserving of criticism. We often invoke knowledge when justifying someone’s decision to act, and we often cite their lack of knowledge when censuring others for acting on inadequate grounds; knowledge figures in our appraisals of blame, negligence, and in conditional orders wherein one is commanded to X just in case one knows a specified condition to obtain.

These facts support the idea that one ought only to use known propositions as premises in one’s practical deliberations. For example, if you opt against purchasing very affordable health insurance, on the grounds that you are plenty healthy, you may be criticized by your loved ones precisely because you do not know that you will not soon fall gravely ill. To take another example: suppose that you spent a dollar on a lottery ticket in a 10,000 ticket lottery with a $5,000 prize, and you are deliberating about whether to sell your ticket. Suppose you reason as follows:

The ticket is a loser.

So if I keep the ticket, I will get nothing.

But if I sell the ticket, I will get a penny.

So I should sell the ticket. (Hawthorne 2004: 29, 85)

Such reasoning should strike us as unacceptable, and a plausible explanation is that the first premise isn’t known. Similarly, suppose that someone offered to sell you their ticket in the same lottery for a cent: if you decline on the basis that you know their ticket will lose, that may also strike us as the wrong basis for declining, for it seems (to many) that you don’t know the ticket will lose. Indeed, if you do know the first premise, standard decision theory validates the reasoning; this suggests that only those of one’s beliefs which amount to knowledge should figure in shaping one’s decision table (cf. Weatherson 2012).
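The arithmetic behind the lottery example can be sketched as follows. This is only an illustration under assumptions the text does not state: that exactly one of the 10,000 tickets wins, that every ticket is equally likely to win, and that the agent is risk-neutral. The names `expected_value_of_keeping` and `best_act` are hypothetical. Treating "the ticket is a loser" as a known premise amounts to setting the chance of winning to zero, which is what makes selling for a penny come out as the recommended act.

```python
from fractions import Fraction

TICKETS = 10_000           # number of tickets (from the example)
PRIZE = Fraction(5000)     # prize in dollars (from the example)
OFFER = Fraction(1, 100)   # sale price of one penny, in dollars


def expected_value_of_keeping(p_win):
    """Expected dollar value of keeping the ticket, given chance p_win of winning."""
    return p_win * PRIZE


def best_act(ev_keep, ev_sell):
    """Pick the act with the higher expected value (ties go to keeping)."""
    return "sell" if ev_sell > ev_keep else "keep"


# Assumption: exactly one winning ticket, all tickets equally likely.
p_win = Fraction(1, TICKETS)

# Decision table built on the unknown premise "the ticket is a loser".
ev_keep_if_loser_premise = expected_value_of_keeping(Fraction(0))

# Decision table built on the actual chance of winning.
ev_keep_uncertain = expected_value_of_keeping(p_win)

print(best_act(ev_keep_if_loser_premise, OFFER))  # recommendation given the premise
print(best_act(ev_keep_uncertain, OFFER))         # recommendation given the odds
```

On these assumptions, keeping the ticket is worth fifty cents in expectation, so the recommendation flips from selling to keeping once the unknown premise is dropped from the decision table.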

These kinds of considerations suggest the following necessary-direction norm, the Action-Knowledge Principle (AKP), which gives a necessary condition on appropriately treating a proposition as a reason for acting:

(AKP) Treat the proposition that p as a reason for acting only if you know that p (Hawthorne & Stanley 2008: 578)

AKP plausibly lies behind our epistemic evaluations of actions, and also provides a nice diagnosis of some comparative intuitions about low stakes vs. high stakes cases (e.g. Stanley 2005, 9-10).

A parallel debate concerns the idea that there is a common epistemic norm—say, knowledge, or perhaps epistemic ‘warrant,’ or justification—which provides a necessary condition on both appropriate assertion in particular and appropriate action/practical reasoning more generally: see Brown 2011 and 2012, Montminy 2013, Gerken 2013. As we will see in the next section, a structurally similar question concerns whether a common epistemic norm governs practical reason as well as theoretical reason (that is, on what one can appropriately take as a reason for believing).

Some important criticisms of AKP are the following. First, as with the KNA above, it doesn’t license acting on p when one holds a justified belief that p; indeed, one might be Gettiered with respect to p (see Brown 2008, Neta 2009). Acting on p in such cases seems to many to be entirely appropriate, and thus these are counterexamples to AKP. As with the KNA, the reply (Hawthorne & Stanley 2008: 573-74, 586) is that such subjects are blameless for making an excusable mistake, and the need for an excuse is explained by AKP.

Second, it has been objected that AKP doesn’t license acting on subjective probabilities of a proposition, and thus that it can seem in conflict with Bayesian decision theory. Sometimes one is only in a position to treat propositions that are probable for one as reasons for acting; Cresto (2010) argues that when probabilistic talk is interpreted in subjectivist terms, AKP can be violated even though it seems as though one has done nothing wrong. On standard Bayesian decision theory, one plugs one’s probabilities, along with one’s values for possible outcomes, into one’s decision table to discern the act which maximizes expected utility. If you assign 0.7 probability to (have 0.7 credence in) the proposition that it will rain, and on that basis choose to carry an umbrella on your walk, have you violated AKP? Perhaps not, for if you know that you assign 0.7 probability to it raining, and use this knowledge as your reason for acting, then you do not violate AKP: the proposition that you treat as your reason for so acting is that rain is 0.7 probable (Hawthorne & Stanley 2008: 580-583). Arguably, one’s credences are not always luminous to one, and thus there is still a role for knowledge (and thus AKP) to play. Weatherson (2012) argues that the role for knowledge in decision theory is that it sets the standard for what gets on to one’s decision table; moreover, it might be that one’s credences can constitute knowledge (Moss 2013), and if so there is room for AKP to govern actions based on them. But still, it might be implausible to suppose that every such case of appropriately acting on a probability involves your knowing what your credence is: though your credence in it may be 0.7 on this occasion, this may not be transparent to you. It may be sufficient for you to act on the more coarse-grained probability that it’s more likely than not that it will rain, even if you do not form the belief that it is more likely than not that it will rain. On this way of looking at things, the objection remains. For important constructive work adjudicating these issues and proposing some ways in which a knowledge norm for practical reasoning and Bayesian decision theory are compatible, see Weisberg (2013).
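To illustrate what plugging one’s probabilities and values into a decision table amounts to in the umbrella case, here is a minimal sketch. The 0.7 credence comes from the example above; the utility numbers and all names are assumptions chosen purely for illustration.

```python
def expected_utility(credences, utilities):
    """Sum over states of credence in the state times utility of the outcome."""
    return sum(credences[state] * utilities[state] for state in credences)

def best_act(credences, table):
    """Return the act in the decision table with the highest expected utility."""
    return max(table, key=lambda act: expected_utility(credences, table[act]))

# Credence of 0.7 in rain, as in the example in the text.
credences = {"rain": 0.7, "no_rain": 0.3}

# Hypothetical utilities: staying dry is good, getting soaked is bad,
# and carrying an unneeded umbrella is a minor nuisance.
table = {
    "carry_umbrella": {"rain": 5, "no_rain": -1},
    "leave_umbrella": {"rain": -10, "no_rain": 2},
}

choice = best_act(credences, table)
print(choice)

# Coarse-grained check: with these utilities, any credence above 0.5
# in rain recommends the same act.
coarse_choices = {
    best_act({"rain": c, "no_rain": 1 - c}, table) for c in (0.51, 0.6, 0.7, 0.9)
}
print(coarse_choices)
```

With these assumed utilities the recommended act does not change anywhere above the "more likely than not" threshold, which fits the thought that one might act appropriately on a coarse-grained probability without knowing one’s exact credence. Other utility assignments could of course make the precise credence matter.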

b. Knowledge and Reasons

We standardly cite reasons as propositions which ought to make a difference to someone’s decision to act one way or another. Such normative reasons are reasons there are for a particular agent to believe, feel, or act a certain way. (Such reasons are distinguishable from both explanatory reasons—reasons why an agent believed or felt or acted—and from motivating reasons—reasons for which an agent acted in a particular way.) Normative reasons can be either possessed by an agent or not possessed by an agent: if Iris is at the bar and there is petrol in the glass in front of her, then there is a reason for her not to drink the liquid in her glass, but it will not be a reason Iris possesses unless she is aware that there is petrol in the glass.

A natural way to approach the connection between knowledge and action is by noting that possessing a reason for some action arguably depends on knowing a proposition, and that lacking knowledge can rob one of possessing the relevant reason (see Hyman 1999, Unger 1975, Ch. 5, Alvarez 2010, and Littlejohn 2014). If Iris knows that there is petrol in the glass, then that is a reason she possesses to refrain from drinking it; but if she does not know it, then she does not possess that reason to refrain, even though there is a reason for her to refrain. There being petrol in the glass can only be a reason Iris possesses if she knows it.

This view connects naturally with the above discussion of the normative relation between knowledge and action: where one treats a proposition as one’s reason for action, and then acts for that reason, one only acts properly when one knows that proposition. This is because, on the view being considered, one cannot possess p as a reason to ϕ unless one knows that p. Of course, one’s motivating reason for ϕ-ing might be a falsehood: one might falsely believe that q and thereby take q as one’s reason for ϕ-ing, and one’s belief that q explains why one ϕ’d. On the view being considered then, one cannot in that circumstance have had q as a reason, for one cannot (because q is false) know that q. That is, the reasons one takes to be one’s reasons can come apart from the reasons one in fact possesses. If this is correct, it has consequences for how to understand the normative concept of justification. In particular, knowledge figures importantly in understanding what reasons justify one in believing or in acting, such that the mark of justification is not an internalist or subjectivist notion of rationality but instead an externalist or objectivist notion explicable in terms of facts or knowledge of facts. See Littlejohn (2014) for more.

Some philosophers question the claim, crucial to the above line of reasoning, that one can possess p as a reason, or properly treat p as a reason for acting, only if p is true (and known). Comesaña and McGrath (2014) call this “factualism about reasons-had,” and against it they argue that one can have false reasons (see also Schroeder 2008, Fantl & McGrath 2009: 100-104, and Dancy 2014, among others). The case for the possibility of having false reasons is built primarily upon two ideas. First, it seems to them that ascribing a reason to someone for their action can be done even if that reason is (or entails) a false proposition. That is, they claim that one could acceptably say of someone that “The reason she turned down the job was that she had another job offer,” even if she did not have another job offer and the speaker knows this. Second, when someone acts on a mistaken belief, there is pressure to claim that she acted for the same reason as she would’ve had her belief in fact been true. On this way of looking at things, there must be the same psychological state that rationalizes Iris’s taking and drinking from the glass with petrol in it as would (counterfactually) rationalize Iris’s taking and drinking from a glass with gin and tonic in it; in other words, such views take what it is that rationalizes to be what it is that provides one with reasons, both motivating and normative: one has the same normative reasons in both the good and bad cases. Such views are at odds with the standard semantics about schemas such as “S’s reason for X-ing was/is that p” or “The reason S had for X-ing was that p”, which entail that p and so are factive; see Comesaña and McGrath (2014) for ways of handling these semantic issues.

As noted in the last section, there is a parallel question about whether the epistemic norm governing practical reason is the same as that governing theoretical reason. Hawthorne & Stanley’s AKP is a knowledge norm on practical reason, but they also note the analogous principle regarding reasons for belief:

(TKP) Treat the proposition that p as a reason for believing q only if one knows that p. (2008, 577)

Littlejohn (2014) notes a compelling argument that AKP is true just in case TKP is, and that more generally, whatever epistemic status norms practical reason must also norm theoretical reason. The argument for it goes thus. Suppose (for reductio) that in fact, the norm for theoretical reason were less epistemically demanding than that for practical reason: for concreteness, suppose that one could treat p as a reason for believing that q only if one were justified in believing that p, but that knowledge still governed practical reason along the lines suggested by AKP. In that case, if you justifiably believe that this liquid is gin, and you know that you ought (if you can) to make another round of drinks for your guests, you could take your justified belief that it is gin as your reason for believing that you can make them another round of drinks. But AKP says that you may treat that latter proposition (that you can make them another round of drinks) as a reason only if you know it; and let’s suppose you don’t know it, because in fact it’s not gin but petrol. In this situation, though it’s proper for you to treat your justified belief as a reason to form another belief, AKP says that you cannot properly treat this new belief as a reason for acting, namely making another round of drinks. If the epistemic norms diverged in this way, they would “demand that you were akratic,” and this seems absurd (Littlejohn 2014: 135-136). Things go similarly if the divergence goes the other way, namely if the norm of theoretical reason were more demanding than the norm of action: together these would permit situations in which one can act on a proposition (say, because one justifiably believes it), but not use it as a premise from which to deduce, and form beliefs in, other propositions. Thus there is a case for the unity thesis that a single epistemic status governs both practical and theoretical reasoning, even if it is not knowledge; for arguments that it is something weaker than knowledge, like justification or warrant, see Gerken (2011).

c. Sufficiency and Pragmatic Encroachment

Though Fantl & McGrath question the necessary-direction principle AKP, they and others do endorse and defend sufficiency-direction principles on which knowledge of a proposition is sufficient to rationalize acting on that proposition. Hawthorne & Stanley (2008, 578) defend a biconditional principle which adds to AKP a sufficiency direction, given a choice one faces which depends on a particular proposition. Where a choice between options X1… Xn is “p-dependent” just in case the most preferable of X1… Xn conditional on the proposition that p is not the same as the most preferable of X1… Xn conditional on the proposition that not-p, the Reason-Knowledge Principle (RKP) says:

(RKP) Where one’s choice is p-dependent, it is appropriate to treat the proposition that p as a reason for acting just in case one knows that p.
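The definition of p-dependence can be sketched as a simple check on a table of preferences. This is an illustration under stated assumptions rather than Hawthorne & Stanley’s own formalism: preferability is modelled by hypothetical numerical payoffs, ties are assumed not to arise, and the gin-or-petrol scenario and all names are supplied only for the example.

```python
def most_preferable(options, payoff, p_holds):
    """Option with the highest payoff conditional on p (True) or on not-p (False)."""
    return max(options, key=lambda option: payoff[option][p_holds])

def is_p_dependent(options, payoff):
    """A choice is p-dependent when the best option given p differs from the best given not-p."""
    return most_preferable(options, payoff, True) != most_preferable(options, payoff, False)

# p: "the liquid in the bottle is gin" (hypothetical payoffs).
options = ["serve_drinks", "do_not_serve"]
payoff = {
    "serve_drinks": {True: 1, False: -100},   # fine if gin, disastrous if petrol
    "do_not_serve": {True: 0, False: 0},
}
print(is_p_dependent(options, payoff))

# A choice on which p has no bearing: the same option is best either way.
sock_options = ["blue_socks", "grey_socks"]
sock_payoff = {
    "blue_socks": {True: 1, False: 1},
    "grey_socks": {True: 0, False: 0},
}
print(is_p_dependent(sock_options, sock_payoff))
```

On this sketch RKP applies to the first choice, where whether it is appropriate to treat p as a reason turns on whether one knows that p, and is silent about the second.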

RKP gives necessary and sufficient conditions for appropriately treating a proposition as a reason for acting. Similarly Fantl & McGrath (2002, 2009, 2012) defend at length a variety of sufficiency conditions tying knowledge to action:

(Action) If you know that p, then if the question of whether p is relevant to the question of what to do, it is proper for you to act on p.

(Preference) If you know that p, then you are rational to prefer as if p.

(Inquiry) If you know that p, then you are proper not to inquire further into whether p.

(KJ) If you know that p, then p is warranted enough to justify you in ϕ-ing, for any ϕ.

On the face of them, these principles can seem exactly right: for example, it might seem obvious that if one knows a proposition, then one is in good enough position to act upon it. But these principles admit of modus tollens as well: if it is not proper for one to act on p, or rational to prefer as if p, or proper to close off inquiry regarding p, or where p is not warranted enough for one to act, for any action one considers undertaking, then one does not know that p. These principles bear out the intuitive judgments of many about such cases: to the extent that one’s epistemic position in some p seems inadequate when facing a decision that depends on that p, to that same extent we tend to be inclined to deny that one knows that p. That is, in cases where the practical stakes for one make it irrational for one to act on a proposition, such principles entail that one does not know that proposition (even though in other contexts where one faces no such decision, where one has the same evidence or is in the same “epistemic” position, one might know that proposition). Thus such views endorse “pragmatic encroachment” in epistemology (also known as “subject-sensitive invariantism” in Hawthorne 2004: Ch. 4, Brown 2008, and DeRose 2009), for practical considerations can seem to encroach on whether one knows. See Neta 2009 and Kvanvig 2011b for some criticisms, and Fantl & McGrath 2012 for arguments that pragmatic encroachment isn’t only about knowledge.

3. Knowledge Norm of Belief

a. The Belief-Assertion Parallel

Some philosophers (going back to at least Frege, Peirce, and Ramsey) find plausible the idea that belief or judgment amounts to a kind of “inner assertion” where (full) belief is the inner analogue to outward (flat-out) assertion. For those inclined to this view who also accept the Knowledge Norm of Assertion, there is a motivation to accept a parallel Knowledge Norm of Belief. Williamson gestures at this idea thus:

It is plausible, nevertheless, that occurrently believing p stands to asserting p as the inner stands to the outer. If so, the knowledge rule for assertion corresponds to the norm that one should believe p only if one knows p. Given that norm, it is not reasonable to believe p when one knows that one does not know p. (2000, 255-56)

Adler (2002: 276ff.) calls this idea the “Belief-Assertion Parallel,” and offers a range of considerations suggesting that belief and assertion are on a par in this way.

Note, however, that this Parallel is likewise intuitive should one prefer some kind of evidential or justification norm, rather than a knowledge norm, on both assertion and on belief. If, epistemically speaking, one shouldn’t assert to others that p without some sufficient evidence or justification for p, then one shouldn’t (epistemically speaking) believe that p without some similar sufficient evidence or justification for p; and in reverse, if (epistemically speaking) one shouldn’t believe that p without some sufficient evidence or justification for p, then one shouldn’t (epistemically speaking) assert to others that p without some similar sufficient evidence or justification for p. Thus to the extent that one finds the epistemic standard for assertion to be similar, if not identical, to the epistemic standard for belief, to that extent the Belief-Assertion Parallel will seem intuitive. Only if one takes the standard for one to be higher than the standard for the other will one be motivated to reject the Belief-Assertion Parallel. (For in-depth discussion, see Goldberg 2015, Chs. 6 and 7.)

Though Williamson does not formulate it explicitly, taking a cue from his KNA schema would provide us with a similar formulation for a Knowledge Norm of Belief, which gives a necessary condition for the propriety of belief:

(KNB) One must: believe p only if one knows p.

(Compare Sutton 2005, 2007; for clarification of how best to understand a norm like KNB, see Jackson 2012.) In addition to the inner/outer parallel noted above, Williamson also provides a different consideration in favor of KNB, one that invokes teleological considerations concerning the “aim” of belief:

If believing p is, roughly, treating p as if one knew p, then knowing is in that sense central to believing. Knowledge sets the standard of appropriateness for belief. That does not imply that all cases of knowing are paradigmatic cases of believing, for one might know p while in a sense treating p as if one did not know p—that is, while treating p in ways untypical of those in which subjects treat what they know. Nevertheless, as a crude generalization, the further one is from knowing p, the less appropriate it is to believe p. Knowing is in that sense the best kind of believing. Mere believing is a kind of botched knowing. In short, belief aims at knowledge (not just truth). (Williamson 2000, 47)

Notice that the KNB provides an elegant and unified account of Moore’s Paradox at the level of belief, a desideratum of many approaches to theorizing about Moore’s Paradox (e.g. Sorensen 1988): these authors note that while the sentences (1)-(4), uttered assertively, are absurd, it also seems absurd to believe (the propositions of) any of their conjuncts together. Huemer (2007) argues explicitly for the idea that theorizing about Moorean conjunctions in this way should lead us to accept both KNA and KNB.

Sosa (2010/2011, Chap. 3: 41-53) provides an interesting argument for another version of the Belief-Assertion Parallel, which arrives at norms similar to KNA and KNB, but he does so by explicit appeal to teleological considerations about the aim of belief. Sosa argues for what he calls the Affirmative Conception of Belief (2011: 41; cf. Sosa 2014):

Consider a concept of affirming that p, defined as: concerning the proposition that p, either (a) asserting it publicly, or (b) assenting to it privately.

With this Affirmative Conception in hand, he then applies considerations from the propriety of means-end action in general to the action of assertion as a special case, using the terminology of his virtue-theoretic epistemology (cf. Sosa 2007):

If one asserts that p as means thereby to assert that p with truth, this essentially involves the relevant means-end belief. I mean the belief that asserting that p is a means to thereby assert that p with truth. And this belief is equivalent to the belief that p. Accordingly, if that means-end belief needs to amount to knowledge in order for the means-end action to be apt, then in order for a sincere assertion that p to be apt, the agent must know that p. In this way, knowledge is a norm of assertion. If an assertion (in one’s own person) that p is not to fall short epistemically it must be sincere, and a sincere assertion that p will be apt only if the subject knows that p. This is, moreover, not just a norm in the sense that the subject does better in his assertion that p provided he knows that p. Rather, if his assertion is not apt, it then fails to meet minimum standards of performance normativity. Any performance (with an aim) that is inapt is thereby flawed. … Knowledge is said to be necessary for proper assertion … If knowledge is the norm of assertion, it is plausibly also the norm of affirmation, whether the affirming be private or public. (2011: 48)

Sosa goes on to develop an intriguing argument for the “equivalence” of the knowledge norm of assertion and the value-of-knowledge thesis (2011: 49-52). For a related view tying the norms of belief and assertion to a virtue-theoretic account, see Wright (2014).

Instructive here is Bach’s combination of views (Bach & Harnish 1979, Bach 2008). Bach holds a Belief Norm of Assertion, on which the only norm fundamental to assertion is that assertions must be sincere (one must outright believe what one flat-out asserts), but he also holds a Knowledge Norm of Belief much like KNB (2008: 77). Because Bach accepts the KNB, he gets a derived version of the KNA: for one must believe only what one knows, but given his Belief Norm of Assertion, one must assert only what one believes; thus one must assert only what one knows, if one is believing as one ought. This combination of views is one which accepts KNB, accepts (the derivative) KNA, but which denies the Belief-Assertion Parallel at the level of what norms are constitutive of assertion and of belief.

An objection to the KNB, similar in spirit to objections to KNA considered above, is that many find it implausible to hold that one is doing epistemically poorly, or doing anything epistemically impermissible, by believing many propositions which we nevertheless do not know, and which we furthermore properly take ourselves not to know. For some important criticisms of KNB, stemming from arguments that there is nothing epistemically problematic or improper about lottery propositions, see McGlynn (2013, 2014). Relatedly, while most find it incoherent or irrational to believe the Moorean conjunction form (1) considered above, many find it unproblematic to believe some conjunctions of the form (2), namely believing a proposition and also believing that one does not know that proposition. Those who object to KNB on these grounds tend to deny a parallel between the epistemic standard for belief and the epistemic standard for knowledge. Couched in evidential terminology, many epistemologists intuitively think of belief in terms of an evidence-threshold model, according to which the evidential threshold which one must meet in order permissibly to believe some proposition is lower than the evidential threshold for knowledge: more evidence is required to know than to (permissibly) believe.

b. Knowledge Disagreement Norm

In a spirit related to considerations stemming from endorsement of the KNB, Hawthorne & Srinivasan (2013) argue for a Knowledge Norm of Disagreement. In the growing literature on the epistemology of disagreement, debate ensues over what one should do in the face of disagreement about some proposition, particularly when those disagreeing with one are regarded as one’s intellectual or evidential peers. Typically such cases of peer disagreement are formulated such that you have formed a belief or a judgment on (or assigned a credence to) some proposition p, and have done so on the basis of some evidence: perhaps it is a judgment about which of two horses won a very close race, and the evidence is visual; or perhaps it is a judgment about what you and your friend each owe from calculating your share of a restaurant bill which you are splitting, in which case the evidence is intellectual and inferential. Many philosophers writing on such cases of disagreement are “conciliationists” of one sort or another, that is, they endorse the idea that in some such disagreements, one does something improper or irrational if one does not either suspend judgment on p, or reduce one’s credence in p. Opposed to conciliationists are “dogmatists” who advocate the idea that in the face of such disagreements, it is sometimes appropriate or rational for one to hold steadfast or “stick to one’s guns” by retaining one’s belief or one’s credence. (See essays in Feldman & Warfield 2010, and Christensen & Lackey 2013 for more.)

Hawthorne & Srinivasan (2013: 11-12), drawing on a knowledge-centric epistemology which takes knowledge to be the central goal of our epistemic activity, articulate a position which is in some ways a middle ground between these two views. They argue for the following Knowledge Disagreement Norm:

(KDN) In a case of disagreement about whether p, where S believes that p and H believes that not-p:

(i) S ought to trust H and believe that not-p iff were S to trust H, this would result in S’s knowing not-p

(ii) S ought to dismiss H and continue to believe that p iff were S to stick to her guns this would result in S’s knowing p, and

(iii) in all other cases, S ought to suspend judgment about whether p.

KDN’s ‘ought’ clauses are motivated by a ranking of actions according to their counterfactual outcomes: according to KDN’s clause (i), one should be ‘conciliatory’ in the face of disagreement just in case trusting one’s disagreeing interlocutor would result in one gaining knowledge, whereas according to clause (ii), one should be ‘dogmatic’ in the face of disagreement just in case doing so would lead to retaining one’s knowledge. Finally, in cases where neither party knows whether the proposition under dispute is true, each should suspend judgment.
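The three clauses of KDN have the structure of a simple case analysis, which can be sketched as follows. This is only a schematic rendering: the two boolean inputs stand in for the counterfactual facts that clauses (i) and (ii) mention, which the agent herself may be in no position to assess, and the names are hypothetical.

```python
from enum import Enum

class Response(Enum):
    TRUST = "trust H and believe not-p"
    DISMISS = "dismiss H and continue to believe p"
    SUSPEND = "suspend judgment about whether p"

def kdn(trusting_would_yield_knowledge: bool,
        steadfastness_would_yield_knowledge: bool) -> Response:
    """Map the counterfactual facts cited in KDN's clauses onto what S ought to do."""
    # Knowledge is factive, so S cannot be such that she would know not-p by
    # trusting H and would also know p by sticking to her guns.
    if trusting_would_yield_knowledge and steadfastness_would_yield_knowledge:
        raise ValueError("p and not-p cannot both be known")
    if trusting_would_yield_knowledge:        # clause (i)
        return Response.TRUST
    if steadfastness_would_yield_knowledge:   # clause (ii)
        return Response.DISMISS
    return Response.SUSPEND                   # clause (iii)

print(kdn(True, False).value)
print(kdn(False, True).value)
print(kdn(False, False).value)
```

Nothing in the sketch tells S which input values obtain in her situation. It lays out only what the norm requires once those facts are fixed.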

Notice that KDN, formulated in the terminology of knowledge and outright belief, is neutral on the matter of how to respond when the ‘disagreement’ concerns divergences in credences toward a proposition: its clause (iii) is capable of accommodating many different approaches here. Further, KDN is fully general in that it does not hold only for cases of peer disagreement: its clauses (i) and (ii) are designed to capture the appropriateness of occasions in which someone defers to an expert or someone in a better evidential position, and thereby can come to know by trusting them. If it is plausible to suppose that becoming apprised of peer disagreement can defeat one’s knowledge, then such cases may be subsumed under clause (iii) (2013: 13-14, 21ff). Finally, KDN has the merit that, if followed, knowledge will tend to be maximized for all parties to a disagreement: if we disagree, but by trusting you, I can come to know what you believe, I ought to do so.

It may be objected that KDN is not easily followed, precisely because knowledge is a non-luminous condition, that is, one is not always in a position to know when one knows; and this is particularly pressing in the case of disagreement, for it is clear that (at least) one of the disagreeing parties doesn’t know, and it can be utterly unclear to most such disputants which one (if any) knows. This objection, and similar objections that are occasionally pressed against the norms of assertion and practical reasoning covered in earlier sections, seems to assume that norms must be perfectly operationalizable, that is, they must be such that one is always in a position to know whether one is complying with them (Williamson 2008). On this idea, a norm N, which requires that one X in circumstances C, will be perfectly operationalizable just in case S can know she is in C, and is thus in a position to reason that, given that she is in C, and could X by A-ing, and that N says she ought to X in C, she ought to A. But it is a substantive question whether norms are or must be perfectly operationalizable; and given that many such conditions of epistemological interest are arguably non-luminous (see Williamson 2000: Ch. 4), one might reconsider the worth of that assumption. For more discussion of this issue and how it relates to the hypological categories of praise and blame, see Hawthorne & Srinivasan (2013: 15-21).

4. References and Further Reading

Adler, Jonathan. 2002. Belief’s Own Ethics. Cambridge: MIT Press.
Alvarez, Maria. 2010. Kinds of Reasons. Oxford University Press.
Bach, Kent, and R. Michael Harnish. 1979. Linguistic Communication and Speech Acts. Cambridge: MIT Press.
Bach, Kent. 2008. “Applying Pragmatics to Epistemology.” Philosophical Issues 18: 68-88.
Benton, Matthew A. 2011. “Two More for the Knowledge Account of Assertion.” Analysis 71: 684-687.
Benton, Matthew A. 2012. “Assertion, Knowledge, and Predictions.” Analysis 72: 102-105.
Benton, Matthew A. 2014a. “Gricean Quality.” Noûs.
Benton, Matthew A. 2014b. “Expert Opinion and Second-Hand Knowledge.” Philosophy and Phenomenological Research.
Benton, Matthew A. and John Turri. 2014. “Iffy Predictions and Proper Expectations.” Synthese 191: 1857-1866.
Blaauw, Martijn. 2012. “Reinforcing the Knowledge Account of Assertion.” Analysis 72: 105-108.
Black, Max. 1952. “Saying and Disbelieving.” Analysis 13: 25-33.
Brogaard, Berit. 2014. “Intellectual Flourishing as the Fundamental Epistemic Norm.” In Clayton Littlejohn and John Turri (eds.), Epistemic Norms: New Essays on Action, Assertion, and Belief. Oxford: Oxford University Press.
Brown, Jessica. 2008. “Subject-Sensitive Invariantism and the Knowledge Norm for Practical Reasoning.” Noûs 42: 167-189.
Brown, Jessica. 2010. “Knowledge and Assertion.” Philosophy and Phenomenological Research 81: 549-566.
Brown, Jessica. 2011. “Fallibilism and the Knowledge Norm for Assertion and Practical Reasoning.” In Jessica Brown and Herman Cappelen (eds.), Assertion: New Philosophical Essays. Oxford: Oxford University Press.
Brown, Jessica. 2012. “Assertion and Practical Reasoning: Common or Divergent Epistemic Standards?” Philosophy and Phenomenological Research 84: 123-157.
Buckwalter, Wesley and John Turri. 2014. “Telling, Showing, and Knowing: A Unified Theory of Pedagogical Norms.” Analysis 74: 16-20.
Cappelen, Herman. 2011. “Against Assertion.” In Jessica Brown and Herman Cappelen (eds.), Assertion: New Philosophical Essays. Oxford: Oxford University Press.
Carter, J. Adam and Emma Gordon. 2011. “Norms of Assertion: The Quantity and Quality of Epistemic Support.” Philosophia 39: 615-635.
Christensen, David and Jennifer Lackey (eds.). 2013. The Epistemology of Disagreement: New Essays. Oxford: Oxford University Press.
Coffman, E.J. 2014. “Lenient Accounts of Warranted Assertability.” In Clayton Littlejohn and John Turri (eds.), Epistemic Norms: New Essays on Action, Assertion, and Belief. Oxford: Oxford University Press.
Comesaña, Juan and Matthew McGrath. 2014. “Having False Reasons.” In Clayton Littlejohn and John Turri (eds.), Epistemic Norms: Assertion, Action, and Belief. Oxford: Oxford University Press.
Cresto, Eleonora. 2010. “On Reasons and Epistemic Rationality.” Journal of Philosophy 107: 326-330.
Dancy, Jonathan. 2014. “On Knowing One’s Reason.” In Clayton Littlejohn and John Turri (eds.), Epistemic Norms: Assertion, Action, and Belief. Oxford: Oxford University Press.
DeRose, Keith. 2002. “Assertion, Knowledge, and Context.” Philosophical Review 111: 167-203.
DeRose, Keith. 2009. The Case for Contextualism. Oxford: Oxford University Press.
Douven, Igor. 2006. “Assertion, Knowledge, and Rational Credibility.” Philosophical Review 115: 449-485.
Douven, Igor. 2009. “Assertion, Moore, and Bayes.” Philosophical Studies 144: 361-375.
Fantl, Jeremy and Matthew McGrath. 2002. “Evidence, Pragmatics, and Justification.” Philosophical Review 111: 67-94.
Fantl, Jeremy and Matthew McGrath. 2009. Knowledge in an Uncertain World. Oxford: Oxford University Press.
Fantl, Jeremy and Matthew McGrath. 2012. “Pragmatic Encroachment: It’s Not Just about Knowledge.” Episteme 9: 27-42.
Feldman, Richard and Ted Warfield (eds.). 2010. Disagreement. Oxford: Oxford University Press.
Gerken, Mikkel. 2011. “Warrant and Action.” Synthese 178: 529-547.
Gerken, Mikkel. 2012. “Discursive Justification and Skepticism.” Synthese 189: 373-394.
Gerken, Mikkel. 2013. “Same, Same but Different: The Epistemic Norms of Assertion, Action, and Practical Reasoning.” Philosophical Studies 168: 725-744.
Goldberg, Sanford C. 2009. “The Knowledge Account of Assertion and the Nature of Testimonial Knowledge.” In Patrick Greenough and Duncan Pritchard (eds.). Williamson on Knowledge. Oxford: Oxford University Press.
Goldberg, Sanford C. 2011. “Putting the Norm of Assertion to Work: The Case of Testimony.” In Jessica Brown and Herman Cappelen (eds.), Assertion: New Philosophical Essays. Oxford: Oxford University Press.
Goldberg, Sanford C. 2015. Assertion: The Philosophical Significance of a Speech Act. Oxford: Oxford University Press.
Grice, Paul. 1989. Studies in the Way of Words. Cambridge: Harvard University Press.
Hawthorne, John. 2004. Knowledge and Lotteries. Oxford: Oxford University Press.
Hawthorne, John and Jason Stanley. 2008. “Knowledge and Action.” Journal of Philosophy 105: 571-590.
Hawthorne, John and Amia Srinivasan. 2013. “Disagreement Without Transparency: Some Bleak Thoughts.” In David Christensen and Jennifer Lackey (eds.), The Epistemology of Disagreement: New Essays. Oxford: Oxford University Press.
Huemer, Michael. 2007. “Moore’s Paradox and the Norm of Belief.” In Susana Nuccetelli and Gary Seay (eds.), Themes from G.E. Moore: New Essays in Epistemology and Ethics. Oxford: Clarendon Press.
Hyman, John. 1999. “How Knowledge Works.” Philosophical Quarterly 49: 433-451.
Jackson, Alexander. 2012. “Two Ways to Put Knowledge First.” Australasian Journal of Philosophy 90: 353-369.
Kvanvig, Jonathan L. 2009. “Assertions, Knowledge, and Lotteries.” In Patrick Greenough and Duncan Pritchard (eds.), Williamson on Knowledge. Oxford: Oxford University Press.
Kvanvig, Jonathan L. 2011a. “Norms of Assertion.” In Jessica Brown and Herman Cappelen (eds.), Assertion: New Philosophical Essays. Oxford: Oxford University Press.
Kvanvig, Jonathan L. 2011b. “Against Pragmatic Encroachment.” Logos & Episteme 2: 77-85.
Lackey, Jennifer. 2007. “Norms of Assertion.” Noûs 41: 594-626.
Lackey, Jennifer. 2011. “Assertion and Isolated Second-Hand Knowledge.” In Jessica Brown and Herman Cappelen (eds.), Assertion: New Philosophical Essays. Oxford: Oxford University Press.
Lackey, Jennifer. 2013. “Deficient Testimonial Knowledge.” In Tim Henning and David P. Schweikard (eds.), Knowledge, Virtue, and Action: Putting Epistemic Virtues to Work. New York: Routledge.
Lackey, Jennifer. 2014. “Assertion and Expertise.” Philosophy and Phenomenological Research.
Littlejohn, Clayton. 2012. Justification and the Truth-Connection. Cambridge University Press.
Littlejohn, Clayton. 2013. “The Russellian Retreat.” Proceedings of the Aristotelian Society 113: 293-320.
Littlejohn, Clayton. 2014. “The Unity of Reason.” In Clayton Littlejohn and John Turri (eds.), Epistemic Norms: Assertion, Action, and Belief. Oxford: Oxford University Press.
MacFarlane, John. 2011. “What is Assertion?” In Jessica Brown and Herman Cappelen (eds.), Assertion: New Philosophical Essays. Oxford: Oxford University Press.
Maitra, Ishani. 2011. “Assertion, Norms, and Games.” In Jessica Brown and Herman Cappelen (eds.), Assertion: New Philosophical Essays. Oxford: Oxford University Press.
Maitra, Ishani and Brian Weatherson. 2010. “Assertion, Knowledge, and Action.” Philosophical Studies 149: 99-118.
McGlynn, Aidan. 2013. “Believing Things Unknown.” Noûs 47: 385-407.
McGlynn, Aidan. 2014. Knowledge First? Basingstoke: Palgrave-Macmillan.
McKinnon, Rachel. 2013. “The Supportive Reasons Norm of Assertion.” American Philosophical Quarterly 50: 121-135.
McKinnon, Rachel and John Turri. 2013. “Irksome Assertions.” Philosophical Studies 166: 123-128.
Montgomery, Brian. 2014. “In Defense of Assertion.” Philosophical Studies. [published online Early View]
Montminy, Martin. 2013. “Why Assertion and Practical Reasoning Must Be Governed by the Same Epistemic Norm.” Pacific Philosophical Quarterly 94: 57-68.
Moore, G.E. 1942. “A Reply to My Critics.” In Paul Arthur Schilpp (ed.), The Philosophy of G.E. Moore, The Library of Living Philosophers. La Salle: Open Court Press. 3rd edn.: 1968.
Moore, G.E. 1962. Commonplace Book: 1919–1953. London: George Allen & Unwin.
Moore, G.E. 1993. “Moore’s Paradox.” In Thomas Baldwin (ed.), G.E. Moore: Selected Writings, 207–212. London: Routledge.
Moss, Sarah. 2013. “Epistemology Formalized.” Philosophical Review 122: 1-43.
Neta, Ram. 2009. “Treating Something as a Reason For Action.” Noûs 43: 684-699.
Pelling, Charlie. 2011. “A Self-Referential Paradox for the Truth Account of Assertion.” Analysis 71: 688.
Pelling, Charlie. 2013a. “Paradox and the Knowledge Account of Assertion.” Erkenntnis 78: 977-978.
Pelling, Charlie. 2013b. “Assertion and the Provision of Knowledge.” Philosophical Quarterly 63: 293-312.
Rescorla, Michael. 2009. “Assertion and its Constitutive Norms.” Philosophy & Phenomenological Research 79: 98-130.
Schaffer, Jonathan. 2008. “Knowledge in the Image of Assertion.” Philosophical Issues 18: 1-19.
Schroeder, Mark. 2008. “Having Reasons.” Philosophical Studies 139: 57-71.
Slote, Michael. 1979. “Assertion and Belief.” In Jonathan Dancy (ed.), Papers on Language and Logic. Keele University Library, pp. 177-90. Repr. in Slote, Selected Essays. New York: Oxford University Press, 2010.
Sorensen, Roy. 1988. Blindspots. New York: Oxford University Press.
Sosa, Ernest. 2007. A Virtue Epistemology: Apt Belief and Reflective Knowledge, volume 1. Oxford: Clarendon Press.
Sosa, Ernest. 2010. “Value Matters in Epistemology.” Journal of Philosophy 107: 167-190.
Sosa, Ernest. 2011. Knowing Full Well. Princeton: Princeton University Press.
Sosa, Ernest. 2014. “Epistemic Agency and Judgment.” In Clayton Littlejohn and John Turri (eds.), Epistemic Norms: Assertion, Action, and Belief. Oxford: Oxford University Press.
Stanley, Jason. 2005. Knowledge and Practical Interests. Oxford: Oxford University Press.
Stanley, Jason. 2008. “Knowledge and Certainty.” Philosophical Issues 18: 35-57.
Stone, Jim. 2007. “Contextualism and Warranted Assertion.” Pacific Philosophical Quarterly 88: 92-113.
Sutton, Jonathan. 2005. “Stick to What You Know.” Noûs 39: 359-396.
Sutton, Jonathan. 2007. Without Justification. Cambridge: MIT Press.
Turri, John. 2010. “Prompting Challenges.” Analysis 70: 456-462.
Turri, John. 2011. “The Express Knowledge Account of Assertion.” Australasian Journal of Philosophy 89: 37-45.
Turri, John. 2013a. “Knowledge Guaranteed.” Noûs 47: 602-612.
Turri, John. 2013b. “The Test of Truth: An Experimental Investigation of the Norm of Assertion.” Cognition 129: 279-291.
Turri, John. 2014a. “Knowledge and Suberogatory Assertion.” Philosophical Studies 167: 557-567.
Turri, John. 2014b. “You Gotta Believe.” In Clayton Littlejohn and John Turri (eds.), Epistemic Norms: Assertion, Action, and Belief. Oxford: Oxford University Press.
Turri, John and Peter Blouw. 2014. “Excuse Validation: A Study in Rule-Breaking.” Philosophical Studies.
Unger, Peter. 1975. Ignorance: The Case for Skepticism. Oxford: Clarendon Press. Reissued 2002.
Weatherson, Brian. 2012. “Knowledge, Bets, and Interests.” In Jessica Brown and Mikkel Gerken (eds.), Knowledge Ascriptions. Oxford: Oxford University Press.
Weiner, Matthew. 2005. “Must We Know What We Say?” Philosophical Review 114: 227-251.
Weisberg, Jonathan. 2013. “Knowledge in Action.” Philosophers’ Imprint 13: 1-23.
Whiting, Daniel. 2013. “Stick to the Facts: On the Norms of Assertion.” Erkenntnis 78: 847-867.
Williamson, Timothy. 1996. “Knowing and Asserting.” Philosophical Review 105: 489-523.
Williamson, Timothy. 2000. Knowledge and its Limits. Oxford: Oxford University Press.
Williamson, Timothy. 2008. “Why Epistemology Cannot be Operationalized.” In Quentin Smith (ed.), Epistemology: New Philosophical Essays. Oxford: Oxford University Press.
Williamson, Timothy. 2009. “Replies to Critics.” In Patrick Greenough and Duncan Pritchard (eds.), Williamson on Knowledge. Oxford: Oxford University Press.
Wright, Sarah. 2014. “The Dual-Aspect Norms of Belief and Assertion: A Virtue Approach to Epistemic Norms.” In Clayton Littlejohn and John Turri (eds.), Epistemic Norms: Assertion, Action, and Belief. Oxford: Oxford University Press.

Author Information

Matthew A. Benton
Email: matthew.benton@philosophy.ox.ac.uk
University of Oxford
United Kingdom

Legal Validity

Legal validity governs the enforceability of law, and the standard of legal validity enhances or restricts the ability of the political ruler to enforce his will through legal coercion. Western law adopts three competing standards of legal validity. Each standard emphasizes a different dimension of law (Berman 1988, p. 779), and each has its own school of jurisprudence.

Legal positivism emphasizes law’s political dimension. Legal positivism recognizes political rulers as the only source of valid law and adopts the will of the political ruler as its validity standard. Leading legal positivists include Jeremy Bentham, John Austin, and H.L.A. Hart.

Natural law theory emphasizes law’s moral dimension. Natural law theory recognizes universal moral principles as the primary source of valid law. These moral principles provide a standard of legal validity that imposes moral limits on the ruler’s coercive powers. Leading natural law theorists include Aristotle, Cicero, Justinian, and Thomas Aquinas.

The historicist school emphasizes law’s historical dimension. The historicist school recognizes legal custom as the primary source of valid law. Legal custom provides a standard of legal validity that imposes customary limits on the political ruler’s coercive powers. Leading historicists include Sir Edward Coke, John Selden, Sir Matthew Hale, and Sir William Blackstone.

Legal positivism recognizes positive law as the only real law and rejects law’s moral and historical dimensions as sources of valid laws. Natural law theory and the historicist school, on the other hand, often integrate law’s three dimensions. They recognize each dimension as a potential source of valid law but emphasize a particular dimension through their validity standard. Blackstone’s unique jurisprudence adopts two validity standards, one from law’s historical dimension, and one from law’s moral dimension.

Standards of legal validity are historically cyclical. A society typically adopts a standard of legal validity based on moral principles, custom, or both. This validity standard restricts the ruler’s ability to enforce his will through legal coercion. Then, intellectual challenges to moral principles and legal custom minimize their esteem. A new validity standard is adopted based on the will of the political ruler. Abuses of coercive powers by political rulers eventually stimulate renewed restrictions on those powers. The society adopts a revived standard of legal validity based on moral principles, custom, or both. The revived validity standard will typically endure until the memory of abuse fades, when the cycle begins again.

This cycle began with Hesiod in 700 B. C. E. and continued into the 21st Century. In common law jurisprudence, judicial acceptance of Hart’s legal positivism eroded Blackstone’s validity standards based on moral principles and custom. In civil law jurisprudence, Soviet and Nazi abuses of positivist legal systems revived validity standards based on moral principles. This essay describes the cycle of legal validity in Western law and proposes a fresh approach to legal validity to break this cycle.

The Sophists
Plato
Aristotle
Cicero
Justinian’s Corpus Juris Civilis
Aquinas
Blackstone
Bentham
Austin
Hart
Radbruch
Positivism in American Jurisprudence
A Fresh Approach
References and Further Reading

1. The Sophists

The first standard of legal validity in the Western legal tradition appears in Hesiod’s religious poem Works and Days, circa 700 B. C. E. Hesiod presents an archetypal jurisprudence that integrates law’s three dimensions. Dikê, the goddess of human justice, personifies law’s moral dimension. Dikê’s father Zeus personifies law’s political dimension. Dikê’s mother Themis, the Titan embodiment of custom and social order, personifies law’s historical dimension.

Justice “sets the laws straight with righteousness” and distinguishes men from beasts. Divinely decreed moral principles establish the validity standard for human law and customs, and conforming laws and customs establish the nomoi (law). Just men obey the nomoi, and obedience brings peace and prosperity. Disobedience brings punishment to the individual and his city through famine, plague, infertility, and military disaster.

The Sophists, wandering teachers of the fifth century B. C. E., challenged Greek conventions in religion, morality, and political conduct. They rejected Hesiod’s moral dimension by rejecting the existence of divine lawgivers and universal moral principles. They rejected Hesiod’s historical dimension by denying any normative authority to custom. Might was right, and law functioned only in the political dimension as the will of the strongest.

The Sophist Protagoras of Abdera (b. circa 481 B. C. E.) rejected law’s moral dimension. As an agnostic, Protagoras rejected the divine lawgiver. As a moral relativist, Protagoras rejected the existence of universal moral principles. Unlike later Sophists, however, Protagoras accepted the validity of custom in law’s historical dimension.

Protagoras based his moral relativism on the argument that a shared factual knowledge of the world is impossible. The foundation of Protagoras’ relativism is the “man-measure” of the Aletheia (Truth). “Man is the measure of all things, of those that are that they are, of those that are not that they are not.”

Sense perception forms the basis of all knowledge, Protagoras believed, and every sense impression that a person receives is securely true. The data of sense perception, however, are private, subjective states. The wind is truly warm to the man who perceives it as warm, but the same wind is truly cold to the man who perceives it as cold. Perceived objects therefore have contradictory properties and there are no public facts.

Protagoras maintained that all knowledge claims are thus equally true. Furthermore, their truth endures regardless of conflicting claims. Protagoras therefore claimed “it is equally possible to affirm and deny anything of anything.” (Aristotle, Metaphysics, 1007b).

Protagoras extended his doctrine that all knowledge claims are equally true to claim that all virtue claims are equally true. Virtue claims are relative to the claimant because virtue is only another form of knowledge. (Plato, Protagoras, 323a-328d). There are no universal moral principles, and law’s moral dimension does not exist.

Although Protagoras rejected law’s moral dimension, he embraced law’s historical dimension. Although all knowledge and virtue claims are equally true, Protagoras argued they are not all equally sound. Only the ignorant equated truth with soundness. One set of thoughts can therefore be “better than another, but not in any way truer.” The same is true of laws. All laws are equally true, but not all laws are equally sound.

Protagoras accepted a duty to obey the law. Since no moral or legal code is truer than any other, no individual should assert his moral or legal judgments over those advanced by the state. Society is required to preserve humanity. The perpetuation of society, in turn, requires respect for law and custom. Men should obey the state’s laws and customs so long as they function soundly. (Plato, Protagoras, 322d; Theaetetus, 167b).

The Sophist Callicles (b. circa 484 B. C. E.) rejected law’s historical dimension and denied any duty to obey the law. Using “nature” to mean the antithesis of mind, Callicles argued that nature’s normative authority (phusis) supersedes the normative authority of man’s laws and customs (nomoi). Man’s laws and customs violate “nature’s own law” and “natural justice.” Nature’s law, not man’s, should govern our actions.

Callicles said that what men call “right” merely expresses what men believe to be to their advantage. Legal conventions in democracies wrongfully elevate the weak over the strong. The majority of weaker folk frame the laws for their advantage to prevent the stronger from gaining advantage over them. The true nature of right is established by nature, not men, and nature’s law establishes right in the strong. Natural justice provides that the better and wiser man should rule over and have more than the inferior. Might, therefore, makes right. All animals and races of man recognize right as the sovereignty and advantage of the stronger over the weaker. (Plato, Gorgias, 483b-d, 490a).

The Sophist Thrasymachus (b. circa 459 B. C. E.) argued for disobeying laws and customs. Defining justice as obedience to the laws, Thrasymachus argued that justice is nothing but the advantage of the stronger. Obedience furthers the advantage of others and reduces the obedient to a form of slavery. Only disobedience to law profits a man and leads to his advantage. Injustice is therefore “a stronger, freer, and more masterful thing than justice.” (Plato, Republic, 338c-344c).

Solon’s constitution created an archetypal positivist legal system in Athens in 594 B. C. E. Solon reposed political and judicial authority in the heliastic courts. The courts enforced undefined laws with no standard of legal validity other than the unrestrained will of the jurors. Pericles’ introduction of payments for jurors in 451 B. C. E. enthroned Athens’ poorest and least educated class as dikasts in the heliastic courts. The Athenian courts became infamous for injustice and gullibility. Xenophon writes that Athenian courts often acted on emotion to put innocent men to death and acquit wrongdoers. (Xenophon 1990, pp. 41-42). Eighty dikasts who found Socrates innocent voted for his death.

Athenian ostracism (ostrakismos) permitted the conviction, exile, and execution of any Athenian without charges, hearing, or defense. Ostracism was originally intended for removing tyrants, but Plutarch records that it quickly became a way of pacifying jealousy of the eminent. Ostracism breathed out malice in exile and death. Everyone was liable to it whose reputation, birth, or eloquence rose above the common level. (Plutarch 1914, pp. 2, 230, 233).

Athens ostracized its greatest heroes from envy of their honors. Athens ostracized Aristides, the hero of the Battle of Marathon, in 483 B. C. E. Athens ostracized Themistocles, savior of Athens at the Battle of Salamis, in 471 B. C. E. Both men were exiled for ten years without charges or a hearing.

Lack of procedural safeguards encouraged frivolous public prosecutions (graphai) and impeachments (eisangeliai), giving free rein to Athens’ gullible and imprudent dikasts. Frivolous political prosecutions destroyed Athens’ leadership, spawning bloody regime changes and military disasters. The frivolous prosecution of Pericles in 443 B. C. E. precipitated the Peloponnesian War with Sparta. The frivolous prosecution of Alcibiades in 415 B. C. E. caused Athens’ ablest general to switch sides and lead Sparta against Athens.

The greatest ignominy involves the Arginusae generals in 406 B. C. E. Six Athenian naval commanders won a great naval victory against Sparta at Arginusae. A violent storm prevented their recovering the dead and shipwrecked. The generals were nevertheless impeached and executed for failing to do so. Deprived of her best generals, Athens lost the war the next year in a devastating naval defeat at Aegospotami.

Political prosecutions wreaked political havoc as well. Five regime changes rocked Athens between 411 B. C. E. and 403 B. C. E. These regimes included the reign of terror by the Thirty Tyrants in 404 B. C. E.

Athenian positivism criminalized thought and expression in frivolous prosecutions against philosophers. Anaxagoras circa 430 B. C. E., Protagoras circa 415 B. C. E., and Socrates in 399 B. C. E. were all convicted on manufactured charges of impiety (asebeia). Impiety was undefined by Athenian law. Every juror defined it anew in every case as he pleased.

Athens often regretted its decisions. Socrates’ lead accuser Anytus was stoned for his role in Socrates’ death. Athens honored Socrates with a bronze statue by Lysippus. Athens thus gained “the indelible reproach of decreeing to the same citizens the hemlock on one day and statues on the next.” (Hamilton 2010, p. 289).

2. Plato

Plato described Socrates as the bravest, wisest, and most upright man of his time. Plato planned a career in politics but “withdrew in disgust” after observing how Athenian courts “corrupted the written laws and customs.” (Plato, Letter VII, 325a-c). Plato reacted to Socrates’ death by repudiating the Sophists, reviving law’s moral and historical dimensions, and formulating a natural law standard of legal validity based on principles of universal justice.

Plato begins his revival of law’s historical dimension by emphasizing the autonomy of law, which he considered the most important aspect of government. Autonomous laws wield supremacy over political rulers. Political rulers are subject to the same laws as other citizens, and they may not alter the laws to suit their will.

Plato wrote that the preservation or ruin of a community depends on the autonomy of laws more than anything else. Respecting law’s autonomy preserves the entire community. Disregarding it brings destruction. Autonomy is so important that “the man who is most perfect in obedience to established law” should receive the highest post in government. The second most obedient man should receive the second highest post, and so on for all the posts. (Plato, Laws, 715c-d.)

Plato begins his revival of law’s moral dimension by persuasively refuting Protagoras’ moral relativism in the Theaetetus. Protagoras claimed that all sense perceptions are equally true. Since knowledge is perception, all knowledge claims are equally true. Since moral claims are a species of knowledge claims, all moral claims are equally true. Therefore, no one set of moral principles has authority to guide the laws.

Plato offers eleven objections to Protagoras’ arguments in the Theaetetus. Three are recounted here. First, Plato denies that knowledge is perception. If knowledge were perception, we would understand anyone speaking to us in a foreign tongue. This is clearly not the case. Second, remembered knowledge refutes Protagoras’ claim that knowledge is perception. Remembered knowledge involves no perception, but it is knowledge nonetheless.

Third, moral relativism is self-refuting. Assume, as Protagoras claims, that “all beliefs are true.” Assume also that another man exists who believes that “not all beliefs are true.” If Protagoras is correct, then the second man’s belief must be true. Protagoras’ belief that “all beliefs are true” is thus refuted. (Plato, Theaetetus, 160e-177b).

Plato continues his revival of law’s moral and historical dimensions in the Crito. The Crito considers whether a duty exists to obey the law. Socrates’ friend Crito argues for Socrates to escape and avoid his unjust execution.

Socrates replies that the soul is more precious than the body. Good actions benefit our souls, but wrong actions mutilate them. The important thing is not living, but living well. This means living honorably. Socrates utilizes three principles in determining whether to escape. First, circumstances never justify wrong action. Second, one should not injure others, even when they injure you. Third, one “ought to honor one’s agreements, provided they are right.” (Plato, Crito, 47e-49e).

Plato defines law’s moral dimension through these principles. Justinian’s Corpus Juris Civilis defines its moral dimension by these same principles in the sixth century. (Justinian, Digest, 1.1.10). Blackstone’s Commentaries does the same in the eighteenth century. (Blackstone 1828, p. 27).

Plato next refutes Thrasymachus’ claim in the Republic that disobeying the law “is a stronger, freer, and more masterful thing” than obeying the law. In the Crito’s “Speech of the Laws,” the Laws present two arguments for obedience. The first is the “argument from agreement.” Socrates has undertaken to live his life in obedience to Athens’ laws. Athens did not force Socrates to live in its precincts. Socrates was free to leave at any time. By choosing to stay in Athens with full knowledge of how the laws functioned, Socrates promised obedience to the laws.

The Laws’ orders are “in the form of proposals, not savage commands.” Socrates can either obey the Laws or persuade the personified Laws that they are at fault. If Socrates escapes without persuading the Laws that they are at fault, he will dishonor his agreement to obey them. Dishonoring a just agreement violates the ethic of “living well” and damages the soul.

The Laws’ second argument is the “argument from injury.” Disobedience destroys both the Laws and the city, which cannot exist if legal judgments are ignored. Socrates concludes that “both in war and in the law courts and everywhere else you must do whatever your city and your country command, or else persuade them in accordance with universal justice” that they are at fault.

The Laws’ second argument implies a natural law standard of validity based on principles of universal justice. The Laws insist they operate as “proposals, not savage commands.” Socrates’ duty to obey the Laws is contingent on the Laws’ compliance with principles of universal justice. By implication, there is no duty to obey the Laws if they violate principles of universal justice. (Plato, Crito, 51e-52d).

3. Aristotle

Aristotle designs his legal philosophy to avoid the catastrophes described in his Athenian Constitution. Aristotle accepts the necessity of law’s political dimension because laws cannot enforce themselves. Nevertheless, Athenian legal history proves that the political dimension is not sufficient to preserve a society or achieve its happiness.

Human nature demands more than political power from law. Law must accomplish justice and foster virtue. Justice is required to prevent revolution, and virtue is required for human happiness. Man separated from justice is “the worst of animals,” and man without virtue “is the most unholy and the most savage of animals.” (Aristotle, Politics 1253a).

Aristotle writes in the Politics that securing justice is the state’s most important function. Justice is more essential to the state than providing the necessities of life. Governments must be founded on justice to endure. Governments that rule unjustly and give unequal treatment to similarly placed subjects provoke revolutions. Justice maintained, however, forms a bond between the members of society that preserves the state. (Aristotle, Politics 1328b, 1332b, 1253a).

Aristotle’s Nicomachean Ethics defines justice as lawfulness concerned with the common advantage and happiness of the political community. Aristotle distinguishes between legal justice (to nomikon dikaion) and natural justice (physikon dikaion). Legal justice involves positive laws and custom enacted by man, such as conventional measures for grain and wine. These “are just not by nature but by human enactment” and “are not everywhere the same.” Aristotle secures legal justice by granting autonomy to law and by utilizing custom to encourage obedience. (Aristotle, Nicomachean Ethics, 1134b-1135a).

Natural justice, on the other hand, involves principles of natural law that originate in nature. Such principles do not arise in the minds of men “by people’s thinking this or that.” Natural law principles apply with equal force everywhere, just as fire burns both in Greece and in Persia. Aristotle secures natural justice by adopting natural law precepts as the standard of legal validity. Positive laws that violate natural law precepts are nullified. (Aristotle, Nicomachean Ethics, 1134b).

Aristotle secures legal justice by restricting the will of the political ruler through autonomous laws. The Politics teaches that unrestrained power produces tyranny, even in democracies. Aristotle considers whether societies function best under the “rule of men” or the “rule of law.” He concludes that laws, when good, should be supreme. Political rulers should merely complement the law by acting as its guardians and ministers. They should only regulate those matters on which the laws are unable to speak with precision owing to the difficulty of any general principle embracing all particulars. (Aristotle, Politics, 1282b).

Aristotle gives four reasons for emphasizing law’s autonomy over the will of the political ruler. First, law frees the state from the desires and passions that afflict political rulers. “The law is reason unaffected by desire. Desire … is a wild beast, and passion perverts the minds of rulers, even when they are the best of men.” (Aristotle, Politics, 1287a). Second, tyranny results when political rulers exercise autonomy over law, even in democracies. Third, the orderly rotation of political offices requires autonomous laws. Equality, liberty, justice, and expediency mandate that every mature citizen participate in governing the state. Fourth, the orderly rotation of political offices preserves the state by assuring evenhanded administration by magistrates.

Aristotle utilizes law’s historical dimension to secure legal justice through custom. Aristotle uses the term nomos for law, and nomos includes custom and convention as components of the social norm. Aristotle writes in the Politics that legal custom is itself a form of justice. Custom and convention maintain social stability by encouraging obedience to the law. The law has no power to command obedience except that of habit, which can only be given by time. Aristotle urges caution in changing the law because changes enfeeble the power of the law. If the advantage of a change is small, it is wiser to leave errors in the law. The citizens usually lose more by the habit of disobedience than they gain by changing the law. (Aristotle, Politics, 1255a, 1269a).

Aristotle utilizes law’s moral dimension to secure natural justice in two ways. The first is by nullifying positive laws that subvert natural law precepts. Aristotle formulates a natural law standard of legal validity. Aristotle’s Rhetoric describes natural law as an unwritten law, based on nature, and common to all people. “There is in nature a common principle of the just and unjust that all people in some way divine.” (Aristotle, Rhetoric, 1373b).

Natural law provides immutable and universal standards of justice. Natural law constitutes a separate body of binding law that exceeds positive law in authority. Human actions should complete nature rather than subvert it, and natural law nullifies positive laws that subvert natural law precepts. (Aristotle, Rhetoric, 1373b).

Like Plato, Aristotle argues that the universal standards of natural law justify disobeying positive laws. Aristotle’s Rhetoric provides two examples invalidating positive law for violating natural law precepts. The first is the case of Sophocles’ Antigone, where Antigone disobeys Creon’s order and provides funeral rites to her brother Polyneices. The second is Aristotle’s guide to jury nullification of written law by appealing to higher principles of natural law. (Aristotle, Rhetoric, 1373b, 1375a-b).

Aristotle never explains why natural law wields supremacy over positive law. The supremacy of natural law is consistent, however, with Aristotle’s view in the Physics that the ultimate causes of nature are divine. (Aristotle, Physics, 198b-199b).

The second way that Aristotle secures natural justice is by fostering virtue. Aristotle believed that human happiness depended on virtue more than liberty. The government is thus responsible for producing a virtuous state, and this is best accomplished through law. Although virtue encompasses more than mere conformity to law, virtue will only develop and flourish in a state that supports the legal enforcement of virtue. The state must provide moral education through its laws to make its citizens just and good. Failing to do so undermines the state’s political system and harms its citizens. (Aristotle, Nicomachean Ethics, 1179b; Politics, 1280b, 1310a, 1337a).

4. Cicero

Marcus Tullius Cicero (106-43 B. C. E.) was a politician, philosopher, orator, and attorney. Cicero’s De Legibus (The Laws), De Officiis (On Duties), and De Re Publica (The Republic) greatly influenced the natural law tradition. Cicero esteemed Plato and Aristotle. Although not a Stoic, Cicero adopted Stoicism’s divine Nature as the source of natural law precepts that dictate legal validity. The histories of Herodotus, Thucydides, Xenophon, and Polybius persuaded Cicero that natural law imposes justice on human events.

Cicero’s signature contribution to jurisprudence is his explication of Nature as divine lawgiver. Law and justice originate in Nature as a divinely ordained set of universal moral principles. Cicero describes Nature as the omnipotent ruler of the universe, the omnipresent observer of every individual’s intentions and actions, and the common master of all people. Belief in divine Nature stabilizes society, encourages obedience to law, and leads to individual virtue. (Cicero, De Legibus, 2.15-16).

Law’s moral dimension dominates Cicero’s jurisprudence. Cicero defines natural law as perfect reason in commanding and prohibiting. These principles are the sole source of justice and provide the sole standard of legal validity. “True law is right reason in agreement with Nature.” (Cicero, De Re Publica, 3.33).

The precepts of natural law are eternal and immutable. They apply universally at all places, at all times, and to all people. Natural law summons to duty by its commands, and averts from wrongdoing by its prohibitions. Nature serves as the enforcing judge of natural law precepts, and Nature’s punishment for violating natural law precepts is inescapable. (Cicero, De Re Publica, 3.33).

Natural law provides the naturae norma, the standard of legal validity for positive law and custom. The naturae norma provides the only means for separating good provisions from bad. Justice entails that laws and customs comply with the naturae norma and preserve the peace, happiness, and safety of the state and its citizens. Positive laws and customs that fail to do so are not regarded as laws at all. (Cicero, De Legibus, 1.44, 2.11-2.14).

Regarding Cicero’s political dimension of law, the magistrate’s limited role is to govern and to issue orders that are just and advantageous in keeping with the laws. Although the magistrate has some control of the people, the laws are fully in control of the magistrate. An official is the speaking law, and the law is a nonspeaking official. (Cicero, De Legibus, 3.2).

Political rulers cannot alter, repeal, or abolish natural law precepts. Furthermore, political rulers have no role in interpreting or explaining natural law precepts. Every man can discern the precepts of natural law for himself through reason. (Cicero, De Re Publica, 3.33).

Political rulers must issue just commands as measured by natural law precepts. Individuals are protected against unjust coercion. Although rulers may use sanctions to enforce legitimate commands, every affected subject has the right to appeal to the people before enforcement of any sanction. Furthermore, no ruler can issue commands concerning single individuals. Any significant sanction against an individual, such as execution or loss of citizenship, is reserved to the highest assembly of the people. As a further protection, all laws must be officially recorded by the censors. (Cicero, De Re Publica, 2.53-2.54; De Legibus, 3.10-3.47).

Like Aristotle, Cicero requires that magistrates be subject to the power of others. Successive terms are forbidden, and ten years must pass before the magistrate becomes eligible for the same office. Every magistrate leaving office must submit an account of his official acts to the censors. Misconduct is subject to prosecution. No magistrate may give or receive any gifts while seeking or holding office, or after the conclusion of his term. (Cicero, De Legibus, 3.9-3.11).

Regarding Cicero’s historical dimension of law, Cicero agrees with Aristotle that custom maintains social stability by encouraging obedience to law. Custom can even achieve immortality for the commonwealth. The commonwealth will be eternal if citizens conduct their lives in accordance with ancestral laws and customs. (Cicero, De Re Publica, 3.41).

5. Justinian’s Corpus Juris Civilis

The Corpus Juris Civilis (Body of Civil Law) codified Roman law pursuant to the decree of Justinian I. Completed in A.D. 535, the four works of the Corpus became the sole legal authorities in the empire. The Institutes was a law school text. The Codex contained statutes dating from A.D. 76. The Digest contained commentaries by leading jurists, and the New Laws was supplemented as new laws became necessary.

The Corpus is the direct ancestor of modern Western civil law systems. Its influence on canon law is seen in the medieval maxim Ecclesia vivit lege romana (the Church lives on Roman law). Common law jurisprudence never accepted the Corpus as binding authority. Nevertheless, its twelfth century revival profoundly influenced the formation of common law jurisprudence through the works of the father of the common law, Henry de Bracton (C. E. 1210 – C. E. 1268).

The Corpus divides law into public law involving state interests and private law governing individuals. Private law is a mixture of natural law, the law of nations, and municipal law. The Corpus establishes a clear hierarchy among law’s three dimensions. The moral dimension occupies the highest position and provides the standard of legal validity. The historical dimension of legal custom occupies the second position, and the political dimension of Roman municipal law occupies the lowest position.

The Corpus’ moral dimension resides in two bodies of law, natural law and the law of nations. Like Cicero, the Corpus originates natural law in a divine lawgiver. “The laws of nature, which are observed by all nations alike, are established by divine providence.” The precepts of natural law are universal, eternal, and immutable. (Justinian, Institutes, 1.2.11; Digest, 1.3.2).

Natural law governs all land, air, and sea creatures, including man. “The law of nature is that which she has taught all animals; a law not peculiar to the human race, but shared by all living creatures.” The Corpus extends natural law to “all living creatures” to repudiate the Sophist arguments that law is merely a human convention with no basis in nature, justice does not exist, and there is no duty to obey law. The Corpus’ rebuttal focuses on the highly socialized behavior of such animal species as ants, bees, and birds. Although animals cannot legislate or form social conventions, they nevertheless follow norms of behavior. These norms affirm the existence of natural law. (Justinian, Institutes, 1.1.3, 2.1.11).

The Institutes and the Digest state three precepts of natural law: “Honeste vivere, alterum non laedere, suum cuique tribuere.” Live honorably, injure no one, and give every man his due. (Justinian, Institutes, 1.1.3; Digest, 1.1.10). These precepts track the Crito’s admonishments to live well, harm no one, and honor agreements so long as they are honorable. (Plato, Crito, 47e-49e). Blackstone’s Commentaries adopts these exact precepts. (Blackstone 1838, p. 27).

The law of nations is the portion of natural law that governs relations between human beings. (Justinian, Digest, 1.4). Its rules are “prescribed by natural reason for all men” and “observed by all peoples alike.” The law of nations is the source of duties to God, one’s parents, and one’s country. It recognizes human rights to life, liberty, and self-defense, and its recognition of property rights enables contracts and commerce between peoples.

The precepts of natural law provide the standard for legal validity. This standard voids any right or duty violating natural law precepts. The Institutes provides illustrative examples: Contracts created for immoral purposes, such as carrying out a homicide or a sacrilege, are not enforceable. (Justinian, Institutes, 3.19.24). Immorality invalidates wrongful profits. Anyone profiting from wrongful dominion over another’s property must disgorge those profits. (Justinian, Digest, 5.3.52).

Immorality invalidates agency relationships. Agents are not obliged to carry out immoral instructions from their principals. If they do, they are not entitled to indemnity from their principals for any liability the agents incur. (Justinian, Institutes, 3.26.7). Immorality even invalidates bequests and legacies if the bequest is contingent upon immoral conduct. (Justinian, Institutes, 2.20.36).

The Corpus’ historical dimension provides custom as a source of enforceable law. The Corpus defines legal custom as the tacit consent of a people established by long-continued habit. Since custom evidences the consent of the people, it is a higher source of law than positive or statutory law. Statutory provisions, if customarily ignored, are treated like repealed legislation. (Justinian, Digest, 1.1.3).

Legal custom establishes the autonomy of law over political rulers. Custom binds judges. A judge’s first duty is “to not judge contrary to statutes, the imperial laws, and custom.” Legal custom even controls statutory interpretation. “Custom is the best interpreter of statutes.” (Justinian, Institutes, 4.17; Digest, 1.1.37).

The Corpus’ political dimension resides in its six categories of Roman municipal law, the “statutes, plebiscites, senatusconsults, enactments of the Emperors, edicts of the magistrates, and answers of those learned in the law.” In contrast to natural law and the law of nations, Roman municipal law was unique to Rome. Its provisions were also “subject to frequent change, either by the tacit consent of the people, or by the subsequent enactment of another statute.” (Justinian, Institutes, 1.2.3, 1.2.11).

6. Aquinas

Thomas Aquinas’ Summa Theologica recognizes all three dimensions of law as potential sources of valid law. The moral dimension wields supremacy, however, through a rigid standard of legal validity. Human laws that fail this standard are not merely unenforceable; they are “perversions of law,” “acts of violence,” and “no law at all.” (Aquinas, Summa Theologica, quest. 94 art. 4; quest. 95 art. 2).

Common law jurisprudence has never accepted Aquinas’ natural law theory. It differs in important ways from Blackstone’s natural law theory. Thomism nevertheless influenced the philosophical method taught in Roman Catholic institutions. Martin Luther King Jr. invoked Aquinas’ natural law theory in the Birmingham jail to justify civil disobedience, and Aquinas’ theory motivates contemporary opponents of abortion and euthanasia.

Question 97 establishes both God and man as lawgivers. Divine and natural law come from the rational will of God. Human law comes from the will of man, regulated by reason. (Aquinas, Summa, quest. 97 art. 3).

Question 90 defines four existence conditions for law. The first condition is that law is an ordinance of reason, that law is created by a being with reason to achieve a goal. The second condition is that the law has the common good as its goal and that laws must distribute their burdens equitably and proportionately among their subjects. The third condition is a lawgiver who has care of the community because unless the lawgiver holds sufficient power to coerce obedience, the law cannot induce its subjects to virtue. The fourth condition is publication, which is required for law to have the binding force to compel obedience. Each condition is necessary for law, and together they are sufficient. Failing any condition renders a purported law an act of violence. (Aquinas, Summa, quest. 90 art. 1-4).

Question 91 divides law into four types. Eternal law is the set of timeless truths that govern the movement and behavior of all things in the universe, including human beings. Divine law is the word of God revealed to man to guide him to his supernatural end. God reveals divine law because human reason is inadequate to discover its precepts. Natural law is that portion of the eternal law that governs the behavior of human beings. Natural law is derived from eternal law, and its precepts are discovered by reason. Human law is any law of human authorship. Man creates human law in order to implement the precepts of natural law. (Aquinas, Summa, quest. 91 art. 1-4).

Question 94 presents Aquinas’ theory of natural law. God writes natural law in the hearts of men, and man discerns the natural law using practical reason. Four natural inclinations enable man to discern the precepts of natural law. The first is an inclination to seek after good. The second is an inclination to preserve one’s own being according to one’s nature. Man shares these first two inclinations with all substances. The third is an inclination to reproduce, raise, and educate one’s offspring. Man shares this inclination with animals. The fourth is an inclination “to know the truth about God and to live in society.” This inclination is unique to man. (Aquinas, Summa, quest. 94 art. 2).

Aquinas divides natural law into “first principles” and “secondary principles.” First principles are unchanging. They are always known by all human beings and they are binding on all human beings. They are mutually consistent, and conflict between them is impossible. They cannot be “blotted out from men’s hearts.” (Aquinas, Summa, quest. 94 art. 6).

The first principles of natural law contain four precepts, each reflecting one of man’s natural inclinations. The first precept is to pursue good and avoid evil. The second is to preserve life and ward off its obstacles. The third is to reproduce, raise, and educate one’s offspring. The fourth is to pursue knowledge and to live together in society. (Aquinas, Summa, quest. 94 art. 2).

Secondary principles of natural law differ significantly from first principles. Secondary principles are subject to change, albeit rarely and for special causes. They are not always known by all persons and they are not always binding. These differences result from practical reason’s susceptibility to perversion by passion, evil habits, and evil dispositions. Lastly, secondary principles can be blotted out from men’s hearts through “evil persuasions,” errors in “speculative matters,” “vicious customs,” and “corrupt habits.” (Aquinas, Summa, quest. 94 art. 6).

Secondary principles form three categories. The first involves secondary principles that are always known by all persons and are always binding, such as “do not murder or slay the innocent.” The second category involves principles that are always binding but not always known, such as “do not steal.” Julius Caesar reports in the Gallic Wars, for example, that the Germans did not know it was wrong to steal. The third category involves principles that are not always binding, such as “goods entrusted to another should be restored.” Although usually binding, this principle does not require returning another’s weapons when they would be used against one’s country. (Aquinas, Summa, quest. 94 art. 4).

Questions 95 through 97 discuss human law. Human law exists because the great variety of human affairs prevents the first principles of natural law from being applied to all men in the same way. Human reason derives human law from natural law precepts for particular matters, and this process creates a diversity of positive law among different peoples. The “force” accorded to human law depends on the method by which it is derived from natural law. (Aquinas, Summa, quest. 95 art. 2).

Aquinas specifies two methods. The first method involves taking a “conclusion” from a premise of natural law. As in science, reason draws specific conclusions of human law by demonstration from natural law principles. Reason demonstrates the human law conclusion that “one must not kill” from the natural law principle that “one should do harm to no man.” Human laws derived by this method have some force of natural law. (Aquinas, Summa, quest. 95 art. 2).

The second method for deriving human law involves making a “determination” from generalities of natural law. As in the arts, details are derived from general forms. A carpenter begins with the general form of a house in his mind, but he must determine the details of its construction as he builds it. Reason determines that murderers should be imprisoned for twenty years from the natural law principle that evildoers should be punished. Unlike conclusions of human law, determinations have no force of natural law. (Aquinas, Summa, quest. 95 art. 2).

Question 96 provides a narrow scope for human law. Human laws should not repress all the vices forbidden by natural law. Since most people are incapable of abstaining from all vices, human law should only prohibit those vices whose suppression is essential for preserving society. Human laws should prohibit murder and theft but remain silent as to lesser vices. (Aquinas, Summa, quest. 96 art. 2).

The Summa provides a fully developed standard of legal validity. Question 96 provides that human laws must be just. Justice requires that human laws accomplish both divine good and human good as described below. Unjust laws are not merely unenforceable; they are perversions of law and acts of violence, and they are powerless to bind the conscience. They are, in fact, not laws at all. (Aquinas, Summa, quest. 96 art. 4).

Human laws accomplish divine good by satisfying the requirements of natural law and divine law. Purported laws that conflict with divine good, natural law or divine law should always be disobeyed. (Aquinas, Summa, quest. 96 art. 4).

Human laws accomplish human good if and only if they meet three conditions. First, the end of the law must be the common good. Second, the human lawgiver must not exceed his power in establishing the law. Third, the burdens of the law must be shared equitably and proportionately by all members of society. Failure to meet any of these conditions renders the purported law unjust. (Aquinas, Summa, quest. 96 art. 4).

Purported laws that conflict with human good are unjust and may usually be disobeyed. A purported law that fails any one of the three conditions for human good falls into this category. An exception arises, however, if disobedience results in “greater harm” or creates a scandal. The unjust human law should then be obeyed, even though it is not truly a law. (Aquinas, Summa, quest. 96 art. 4).

Critics often charge that Aquinas’ claim that “an unjust law is no law at all” is incoherent. This criticism seemingly disregards Aquinas’ definition of law in Question 95. Laws have “just so much of the nature of law” as they are derived from natural law. Natural law is always just. To be considered law “at all,” therefore, human laws must be just. A purported law that is unjust is not truly a law. (Aquinas, Summa, quest. 95 art. 2).

7. Blackstone

Sir William Blackstone’s Commentaries on the Laws of England is the standard statement of common law jurisprudence. Blackstone imposes two standards of legal validity, one based on custom and the other on natural law. Purported laws that fail these standards are not merely “bad law,” they are “not law.” (Blackstone 1838, p. 47).

Law’s historical dimension provides the validity standard based on custom and serves as the primary source of human law. The historical dimension also emphasizes the autonomy of custom over the will of political rulers. Law’s moral dimension provides the validity standard based on natural law. The moral dimension also establishes natural rights as limits on the will of the political ruler and protects these rights through due process. The political dimension provides only a limited source of law, and the historical and moral dimensions severely restrict the political ruler’s ability to enforce his will through legal coercion.

Law’s historical dimension dominates Blackstone’s jurisprudence. Custom is “the first ground and chief corner stone” of common law. Custom includes rules of law, such as the rule of primogeniture, which says the oldest male descendant inherits the entire estate. Custom also includes legal principles in the form of maxims, such as “the king can do no wrong,” “no man is bound to accuse himself,” and “no man ought to benefit from his own wrong.” Law’s historical dimension is so strong in common law that approved statutes were strictly construed and interpreted whenever possible to comply with pre-existing custom. (Blackstone 1838, pp. 46, 50).

Blackstone divides customary law into three types. The first type, “general customs,” applies to the entire kingdom. The second type, “particular customs,” applies only to limited regions or specialized groups like merchants. For illustration, the “general custom” of inheritance for England is primogeniture, where the eldest son inherits all. Nevertheless, the “particular custom” of gavelkind permits shared inheritance in Kent. The third type, “peculiar laws,” includes Roman civil law and Catholic canon law. These laws have no authority in England except as the people have consented to their provisions through customary observance. (Blackstone 1838, pp. 45-57).

The validity standard for custom includes seven requirements. First, the custom must “have been used so long, that the memory of man runs not to the contrary.” Proof of any time when the custom did not exist voids the custom. Second, the custom must enjoy continuous observance; interruption voids the custom. Third, the custom must enjoy peaceable observance. Custom depends upon consent, and disputed customs lack consent. Fourth, customs must be “reasonable” and must not create unnecessary hardships. Fifth, the custom must be certain. A custom that the worthiest son inherits is void because no certain standard for worthiness exists. Sixth, compliance must be mandatory. Optional customs have no coercive force. Lastly, customs must be consistent. Inconsistent customs lack mutual consent. (Blackstone 1838, pp. 53-55).

Law’s moral dimension provides a standard of legal validity based on natural law. Blackstone’s natural law founds justice on the eternal and immutable laws of good and evil to which the creator himself conforms. God is a being of infinite power, infinite wisdom, and infinite goodness. Although God endows man with reason and free will, man is still “entirely dependent” on God. Man is subject to God’s law, and God’s law is natural law. Natural law is binding over the entire globe, in all countries, and at all times. No human laws are of any validity if they conflict with natural law, and valid human laws derive all their force and authority from natural law.

Natural law precepts are discernible by reason as far as they are necessary for the conduct of human actions. Unlike Aquinas, however, Blackstone regards human reason as “frail, imperfect, and blind” since man’s fall. To overcome these defects of human reason, God reveals the precepts of natural law through direct revelation in scripture. The validity of human law depends on the two foundations of natural law and revealed law. Human laws contradicting their precepts are void.

Natural law permits acts that promote true happiness and prohibits acts that destroy it. Natural law derives from the precept “that man should pursue his own true and substantial happiness.” God created human nature so that man obtains happiness by pursuing justice. Injustice brings unhappiness.

Substantively, natural law consists of eternal immutable laws of good and evil. Blackstone adopts three precepts of natural law from Justinian’s Institutes. “Such, among others, are these principles: that we should live honestly, should hurt nobody, and should render to every one his due; to which three general precepts Justinian has reduced the whole doctrine of law.” (Blackstone 1838, pp. 27-28).

Blackstone divides jurisprudence into natural law and positive law. Positive law provisions contrary to natural law are invalid. Individuals are furthermore bound to disobey them, such as laws requiring murder. Nevertheless, natural law does not determine every legal issue. Natural law is indifferent, for example, as to whether positive law permits the export of wool. On most issues, man is at liberty to adopt positive laws that benefit society. (Blackstone 1838, pp. 28-29).

Blackstone divides rights into two types, absolute rights and relative rights. The “immutable laws of nature” vest absolute rights in individuals. Individuals enjoy absolute rights in the state of nature, prior to the formation of society. (Blackstone 1838, pp. 88, 94).

Blackstone names three absolute rights: personal security, personal liberty, and private property. The absolute right of personal security consists of the legal enjoyment of life, limb, body, health, and reputation. The absolute right of personal liberty consists of the free power of movement without imprisonment or restraint unless by due course of law. The absolute right of property consists of the free use and disposal of lawful acquisitions, without injury or illegal diminution. (Blackstone 1838, pp. 93-100).

Relative rights, in contrast to absolute rights, exist only in society. Relative rights protect and maintain inviolate the three absolute rights of personal security, personal liberty, and private property. Unlike absolute rights, which are few and simple, relative rights are more numerous and more complicated. Such rights include due process protections as well as “Blackstone’s ratio,” which says it is better that ten guilty persons escape than one innocent party suffers. (Blackstone 1838, pp. 89, 102).

Law’s political dimension is severely delimited in Blackstone’s jurisprudence. Society is formed for the protection of individuals. In addition to the validity standards discussed above, Blackstone’s historical dimension dictates a near absolute standard of legal autonomy. Law wields supremacy over the will of political rulers, whether they are kings or judges. (Blackstone 1838, p. 32).

Regarding the autonomy of law over kings, the most important maxim in English history is “the law makes the king; the king does not make the law.” This maxim dates from Henry de Bracton’s 1235 treatise The Laws and Customs of the Kingdom of England. “The king must not be under man but under God and under the law, because the law makes the king … there is no king where the will and not the law has dominion.” (De Bracton 1968, p. 33).

Regarding the autonomy of law over judges, Blackstone’s “declaratory theory” prohibits judges from making new law. Judges may only find and declare existing law; they may never make law. Judge-made law unites the power to make and enforce law in one body, and this invites tyranny. The judge should determine the law according to the known laws and customs of the land, not his own private judgment. Judges are not appointed to pronounce new laws. (Blackstone 1838, pp. 46, 105).

Nevertheless, since all law is subject to the standard of reason, judges may set aside common law precedents that are contrary to reason as “manifestly absurd or unjust.” Setting unreasonable precedents aside does not create new law. Instead, it vindicates the law from misrepresentation. Unreasonable rules of common law, by definition, are not law. Such precedents are not set aside because they are bad law, but because they are not law. (Blackstone 1838, pp. 46-47).

In applying statutory law, however, the judge may never exercise his discretion to set aside the will of Parliament. The only authority that can declare an act of Parliament void is Parliament itself. The judge must “interpret and obey” its mandates. Judges may never act as miniature legislatures. “In a democracy,” writes Blackstone, “the right of making laws resides in the people at large.” (Blackstone 1838, pp. 27, 33).

8. Bentham

Legal positivism rejects law’s moral and historical dimensions as sources of law or standards of legal validity. H. L. A. Hart is the most important figure in the positivist tradition that begins with Jeremy Bentham and John Austin. Bentham was sixteen when he attended a series of private lectures by Blackstone on the common law. These lectures were later published as Blackstone’s Commentaries.

The young Bentham listened with rebel ears. Bentham’s anonymous Fragment on Government describes Blackstone’s natural law theory as “theological grimgribber” and an “excursion into the land of fancy.” Bentham describes Blackstone as “the dupe of every prejudice,” “the accomplice of every chicanery,” “the abettor of every abuse,” and “a treasury of vulgar errors.” (Bentham 1977, p. 10).

Bentham’s legal theory has two distinctive features. The first is Bentham’s exclusion of law’s historical dimension. Bentham’s “imperative” theory of law defines law as (1) the assemblage of signs of a sovereign’s volition, (2) directing the conduct of persons under his power, (3) accompanied by an “expectation” in such persons, that (4) motivates obedience. The sovereign’s will provides its own validity standard. Custom is excluded and the ruler wields autonomy over law. (Bentham 1970, p. 1).

Bentham’s second distinctive feature is his exclusion of law’s moral dimension. Law for Bentham has no necessary conceptual connection with morality. Bentham abandons Blackstone’s immutable standards of right and wrong for physical sensations of pleasure and pain: “Nature has placed mankind under the governance of two sovereign masters, pain and pleasure. It is for them alone to point out what we ought to do.” (Bentham 1907, p. 1).

Bentham’s Anarchical Fallacies argues that natural laws and natural rights are imaginary. “Natural rights is simple nonsense: natural and imprescriptible rights, nonsense upon stilts.” Positive law is the only real law. Only positive law can create real rights, and positive law requires the existence of a sovereign. There can be no rights outside the existence of a sovereign command, and no rights can exist prior to the formation of a government. In sum, the will of the sovereign provides its own standard of legal validity, unrestrained by morality, custom, or the autonomy of law. (Bentham 1843, pp. 501-05).

9. Austin

John Austin’s The Province of Jurisprudence Determined defines law’s political dimension as the sole source of law and legal validity. Like Bentham’s “imperative” theory, Austin’s “command” theory of law establishes the political ruler’s will as its own standard of legal validity. The sovereign can coerce his will through law without restraint by moral principles, custom, or the autonomy of law.

Austin’s “command” theory defines law as (a) commands, (b) backed by threat of sanctions, (c) from a sovereign, (d) to whom people have a habit of obedience. A common criticism of Austin’s theory is that the command of a gun-wielding highwayman arguably satisfies Austin’s definition of law.

The “command” theory rejects law’s historical dimension. Legal customs and principles play no part in law. Law wields no autonomy over the political ruler’s will, including the will of judges. In contrast to Blackstone, Austin encourages judges to legislate from the bench. Society cannot function unless judges are free to make new law to correct the negligence and incapacity of legislatures. (Austin 2000, pp. 191, 225-31).

Austin’s “command” theory rejects law’s moral dimension as well. Austin labels Blackstone’s natural law validity standard “stark nonsense.” God’s law is uncertain, and Blackstone’s natural law standard preaches anarchy. Austin writes that “the existence of law is one thing; its merit and demerit another. Whether it be or be not is one enquiry; whether it be or be not conformable to an assumed standard, is a different enquiry. A law, which actually exists, is a law, though we happen to dislike it.” (Austin 2000, p. 184).

10. Hart

Hart’s 1957 lecture “Positivism and the Separation of Law and Morals” emphasizes three doctrines asserted by Bentham and Austin. The first, which Hart retains, is an emphasis on “the meaning of the distinctive vocabulary of the law.” The second doctrine, which Hart retains, is the separation of law and morals. Hart holds law “as it is” distinct from law “as it ought to be.” This distinction rejects moral standards as the test for legal validity. (Hart 1958, pp. 594, 601).

The third doctrine, which Hart rejects, is Austin’s command theory of law. Hart rejects Austin’s theory for four reasons. First, Austin fails to recognize that laws generally apply to those who enact them. Second, Austin does not account for laws granting public powers, such as the power to legislate or adjudicate, or for laws granting private powers to create or modify legal relations. Third, Austin fails to account for laws that originate, not from a sovereign, but out of common custom. Fourth, Austin fails to account for the continuity of legislative authority characteristic of a modern legal system. (Hart 1994, p. 70).

Hart replaces Austin’s “command” theory with a model of law as the union of primary and secondary social rules. A primary rule is a rule that imposes an obligation or a duty. “[P]rimary rules are concerned with the actions that individuals must or must not do,” such as restrictions on “violence, theft, and deception.” A rule imposes an obligation or duty when the demand for conformity is insistent and the social pressure brought to bear upon those who deviate from the rule is great. (Hart 1994, pp. 91, 94).

In order for a system of primary rules to function effectively, Hart states that secondary rules may also be necessary to provide an authoritative statement of all the primary rules. In contrast to primary rules, which impose obligations and duties, secondary rules confer powers to introduce, to change, or to modify a primary rule. These powers may be public or private. (Hart 1994, pp. 96-97).

There are three types of secondary rules. The first type is the rule of change. This rule allows legislators to make changes in the primary rules if the primary rules are defective or inadequate. The second type is the rule of adjudication. This rule enables courts to resolve disputes regarding the interpretation and application of primary rules. The third type of secondary rule is the rule of recognition. The rule of recognition provides “a rule for conclusive identification of the primary rules of obligation.” It also provides Hart’s criterion for legal validity. A rule of law is legally valid if it conforms to the requirements of the rule of recognition. (Hart 1994, pp. 95-98, 103-05).

Hart next turns from defining the validity criteria for individual laws to defining the validity criteria for entire legal systems. System validity is determined by the attitudes of citizens and public officials toward obedience to legal rules. Hart describes two contrasting attitudes, the “external” and “internal” points of view.

The external point of view is the view of a person who feels no obligation to follow the law. He has no sense that it is right to follow the law or wrong not to do so. He rejects law as the standard of conduct for himself or others. The internal point of view, on the other hand, is the view of a person who feels obligated to follow the law. He follows the law because he thinks it is right to do so and wrong not to do so. He feels that he ought, must, and should follow the law. (Hart 1994, pp. 56-57).

The validity of a legal system depends on only two conditions. First, private citizens must generally obey the primary rules of obligation. It is sufficient that citizens take an external point of view toward primary rules. Second, public officials must adopt the rule of recognition specifying the criteria for legal validity as their “public standard of official behavior.” It is a minimum, necessary condition that officials take the internal point of view toward secondary rules. (Hart 1994, pp. 116-17).

Hart’s standard of legal validity functions solely in law’s political dimension. The will of the political rulers determines the validity of law by their adoption of a rule of recognition. The will of the political rulers determines the validity of the legal system as well. The only necessary condition for a valid legal system is the political rulers’ adoption of the internal point of view.

Hart excludes the historical dimension from his standard of legal validity. Hart omits, for example, two of the historical dimension’s traditional restraints on the will of the political ruler. The first, emphasized since Aristotle, is the autonomy of law over political rulers. Instead, Hart’s political rulers wield autonomy over law by controlling the standard of legal validity. Hart also grants judges autonomy over law by rejecting Blackstone’s declaratory theory that judges find but do not make law. If the judge determines the meaning of a legal rule to be “indeterminate or incomplete,” the judge “must exercise his discretion and make law for the case instead of merely applying already pre-existing settled law.”

The second historical restraint, emphasized by Locke and Blackstone, is the validity requirement of consent by the governed. Consent is irrelevant to Hart’s legal validity. It is sufficient that each member of the population obeys Hart’s primary rules “from any motive whatsoever.” “Any motive,” as Hart’s critics point out, includes terror and force.

Hart also excludes law’s moral dimension from his standard of legal validity. Hart accepts “morally iniquitous” laws as legally valid. “There are no necessary conceptual connections between the content of law and morality; and hence morally iniquitous provisions may be valid as legal rules or principles. One aspect of this form of the separation of law from morality is that there can be legal rights and duties which have no moral justification or force whatever.” (Hart 1994, p. 268).

11. Radbruch

Gustav Radbruch utilizes legal history to support a validity standard invoking law’s moral dimension. Radbruch, once Germany’s leading positivist, argues that the positivist separation of law and morality facilitated Hitler’s atrocities through legal means. Radbruch argues that German positivism rendered “jurists and the people alike defenseless against arbitrary, cruel, or criminal laws, however extreme they might be. In the end, the positivistic theory equates law with power; there is law only where there is power.” (Radbruch 2006b, p. 13). Positivism, in other words, operates only in law’s political dimension.

Radbruch blames the positivistic legal thinking that held sway over German jurists for rendering impotent every possible defence against the abuses of National Socialist legislation. Radbruch warns, “We must arm ourselves against the recurrence of an outlaw state like Hitler’s by fundamentally overcoming positivism.” Radbruch’s solution is a standard of legal validity invoking law’s moral dimension. (Radbruch 2006a, p. 8).

This validity standard, known as “Radbruch’s Formula,” has been applied by German courts. In cases where the discrepancy between justice and statutory law becomes “unbearable,” the statute is held void ab initio in the interest of justice. “Radbruch’s Formula” holds such statutes void ab initio because they are not truly laws.

Radbruch explains: “Where there is not even an attempt at justice, where equality, the core of justice, is deliberately betrayed in the issuance of positive law, then the statute is not merely ‘flawed law’, it lacks completely the very nature of law. For law, including positive law, cannot be otherwise defined than as a system and an institution whose very meaning is to serve justice. Measured by this standard, whole portions of National Socialist law never attained the dignity of valid law.” (Radbruch 2006a, p. 7). Radbruch thus joins Cicero, Aquinas, and Blackstone in concluding that unjust laws are not laws at all.

12. Positivism in American Jurisprudence

Hart’s separation of law from morality stimulated significant criticism in the United States. Lon Fuller’s The Morality of Law argues that law is subject to an internal morality consisting of eight principles. Laws must be enforced, for example, in a manner consistent with their wording. Legal systems that violate these principles cannot achieve social order. They destroy any moral obligation to obey the law. (Fuller 1964, pp. 33-40).

Ronald Dworkin’s “The Model of Rules” argues that Hart’s model of law is incomplete. Courts often decide difficult cases according to legal principles that provide moral justifications for case outcomes. One example is the common law maxim that no man should profit from his own wrongful conduct. These legal principles are outside Hart’s definition of primary and secondary rules. (Dworkin 1967, pp. 23-24).

Hart’s legal positivism nevertheless exerts significant influence in American jurisprudence. Four factors enhance Hart’s influence. The first occurred in 1871 when Dean Christopher Langdell of Harvard Law School dropped Blackstone’s Commentaries from Harvard’s legal curriculum. Blackstone’s jurisprudence lost influence as other schools followed.

The second enhancing factor is the erosion of law’s moral dimension. Oliver Wendell Holmes, Jr. is a leading figure in this process. Holmes advocated for law without values and identified himself as a skeptic. Holmes defines truth as the majority vote of any nation that is more powerful than all the others. Holmes equates a jurist searching for validity criteria in natural law to the poor devil who must get drunk to satisfy his demand for the superlative. (Holmes 1918, p. 40).

Holmes’ “Path of the Law” presents an early form of positivism. Holmes argues for the separation of law and morality. Holmes supports banishing every word of moral significance from the law. He rejects every ethical obligation in contract law. Holmes advocates a “bad man” perspective that looks at law as a bad man would, feeling no obligation to obey it. This is an early statement of Hart’s “external point of view.” (Holmes 1997, pp. 991-997).

The third factor enhancing Hart’s influence is the erosion of law’s historical dimension. Dean Roscoe Pound of Harvard Law School illustrates its erosion. Pound’s “Mechanical Jurisprudence” advocates abandoning custom as a source of any law. Pound urged replacing the common law system based on custom with a civil code system based on statutes. (Pound 1908, pp. 605-23).

The fourth factor enhancing Hart’s influence is the natural desire of judges to “make” new law. Blackstone’s “declaratory theory” forbids judge-made law, but Hart’s “penumbra doctrine” considers it an ordinary and necessary judicial function. One striking example of Hart’s influence is Griswold v. Connecticut, 381 U.S. 479 (1965). Griswold applies a penumbra analysis to imply a Constitutional right of privacy while admitting no such right appears in the language of the Constitution. The Supreme Court decided Roe v. Wade, 410 U.S. 113 (1973) based on Griswold’s implied right of privacy. The increased willingness of judges to legislate from the bench in 20th and 21st Century American courts is Hart’s most significant and controversial legacy in American jurisprudence.

13. A Fresh Approach

Augustine’s City of God observes that kingdoms without justice are but great bands of robbers. Robbers become rulers, not by the removal of greed, but by the addition of impunity. (Augustine 1998, pp. 147-48). Validity standards are the primary means by which societies deny impunity to unjust rulers. Legal validity governs the enforceability of law, and the standard of legal validity controls the ruler’s ability to enforce his will through legal coercion.

Standards of legal validity are historically cyclical, and the cycle continues in the United States in the 21st Century. American law initially embraced Blackstone’s dual validity standards based on moral principles and legal custom. Centuries of challengers have eroded those standards. Bentham, Austin, Holmes, and Hart eroded Blackstone’s moral standard by advocating the separation of law from morality. Pound eroded Blackstone’s customary standard by advocating the abandonment of common law. Legal educators dropped Blackstone from their curriculum.

These challengers eroded Blackstone’s validity standards, but they did not supplant them. A validity schism divides American jurisprudence. There is no generally accepted validity standard in American law. Academic theorists and legal educators favor Hart for his analytical clarity. Liberal judges favor Hart for increasing their power to make new law. Practitioners and conservative judges favor Blackstone for his emphasis on consent of the governed, autonomy of law, predictability of law, and morally just decisions.

Two irreconcilable bodies of precedent emerge, one formulated by traditional judges who limit themselves to finding existing law, the other by positivist judges who make new law. As judges increasingly make new law, courts become unpredictable, ex post facto rulings increase, and laws are unevenly applied. Unelected federal judges set aside democratic resolutions of political questions and decide policy issues without public input. Justices devise or limit Constitutional rights according to personal preference to achieve their desired case outcome.

Despite fifty years of debate, the opposing camps remain estranged. Each side utilizes methods its opponent will never accept. Blackstone, for example, formulates his moral precepts in terms of divine law and human reason. This formulation is unpersuasive for two reasons. First, there is no general agreement regarding the terms of divine law, and many reject its very existence. Second, Blackstone adopts inconsistent views of human reason. On one hand, human reason is too “frail, imperfect, and blind” to generate just human laws. On the other hand, human reason is sufficient to generate the precepts of natural law from revelations of divine law.

Legal positivism is unpersuasive as well, insisting on a narrow philosophical method to formulate its standard of legal validity. Hart emphasizes “a purely analytical study of legal concepts, a study of the meaning of the distinctive vocabulary of the law.” (Hart 1958, p. 601). He describes all law as consisting of only two types of rules. Hart’s simplistic model of law is inadequate for three reasons.

First, Hart’s analysis excludes law’s historical and social contexts. Hart restricts his analysis to law’s linguistic context. Law is more than linguistics. It encompasses the entirety of the great variety of human affairs. Hart’s exclusion of these indispensable contexts commits the “analytical fallacy” described by John Dewey in “Context and Thought” (Dewey 1985, pp. 5-7).

Second, Hart’s standard of legal validity ignores the content of law. Hart only considers the pedigree of the law’s creation. Hart consequently accepts the validity of “morally iniquitous” laws whose content possesses “no moral justification or force whatever.” (Hart 1994, p. 268).

Hart ignores the grave consequences of enforcing “morally iniquitous” laws. For example, Hart validates legal systems if two conditions are met. First, citizens may take an external point of view toward primary rules. Obedience “from any motive whatsoever” is sufficient, permitting coercion through terror. Second, officials must take an internal point of view toward secondary rules. Objectively considered, the legal systems utilized by Stalin and Hitler satisfy both conditions.

Third, Hart’s model of law as rules is incomplete. Something important is missing from a legal philosophy that validates the Soviet and Nazi legal systems. That missing element is justice, and justice is a moral concept. As Ronald Dworkin explains, courts usually decide difficult cases according to legal principles that provide moral justifications for case outcomes. Hart’s model of rules excludes these principles. (Dworkin 1967, pp. 23-24).

Hart showed how to separate law from morality, but history showed why societies should not do so. Critics contend that a fresh approach is needed.

Neither Blackstone nor Hart assigns legal history a significant role in formulating his validity standard. No major jurist since Cicero has done so. Nevertheless, a historical formulation of legal validity can avoid the problems described above. Unlike Blackstone’s approach, legal history does not require belief in a divine lawgiver, and unlike Hart’s, it does not ignore the content of law.

Legal history provides a long record of legal experimentation. A scientific approach identifies three principles that recur in just and stable legal systems. Legal systems without these principles repeatedly become arbitrary, unjust, and unstable.

The first principle is the principle of reason, which addresses the validity of law’s content. The principle of reason recognizes that every subject is a rational creature with a free will. To be stable, the legal system must treat its subjects as ends in themselves, and not as a mere means to another end. The legal system must also permit rational individuals to orient their own behavior in order to achieve a society based on ordered liberty. Procedural due process protects against the punishment of the innocent and the tyranny of the majority. Substantive due process enables laws to provide dependable guideposts to individuals in orienting their behavior.

The second principle is the principle of consent, which addresses the validity of law’s creation. This principle provides that the legitimacy of law derives from the consent of those subject to its power. Common law custom, the doctrine of stare decisis, and legislation sanctioned by the subjects’ legitimate representatives are all evidence of consent.

The third principle is the principle of autonomy, which addresses both the content and the creation of law. Laws must wield supremacy over political rulers. The ruler must be under the same laws as his subjects, and the laws must not be subject to arbitrary change to reflect the ruler’s will. To paraphrase de Bracton, the law must make the king. The king must not make the law. To paraphrase Aristotle, rightly constituted laws must be the final sovereign.

These principles operate in law’s moral and historical dimensions to restrain the ruler’s ability to enforce his will through legal coercion. Legal systems become unjust and unstable in the absence of such restraints. They project the power of the political ruler, but they are not valid legal systems. The history of the Western legal tradition is the history of revolutions against such systems. (Berman 1983).

14. References and Further Reading

Aquinas, Thomas. Treatise on Law (Summa Theologica, Questions 90-97). Ed. Ralph McInerny. Washington: Regnery, 1996. Print.
Aristotle. The Athenian Constitution. Trans. Sir Frederic G. Kenyon. Seaside, OR: Merchant, 2009. Print.
Aristotle. Ethica Nicomachea. Trans. W.D. Ross. New York: Oxford UP, 2009. Print.
Aristotle. Metaphysics. Trans. Joe Sachs. Santa Fe: Green Lion, 2002. Print.
Aristotle. Physics. Trans. Robin Waterfield. Ed. David Bostock. Oxford: Oxford UP, 1996. Print.
Aristotle. The Politics of Aristotle. Trans. Ernest Barker. Oxford: Oxford UP, 1946. Print.
Aristotle. Rhetoric. Ed. W.D. Ross. Trans. W. Rhys Roberts. New York: Cosimo, 2010. Print.
Augustine. The City of God against the Pagans. Trans. R.W. Dyson. Cambridge: Cambridge UP, 1998. Print.
Austin, John. The Province of Jurisprudence Determined. Amherst, NY: Prometheus, 2000. Print.
Bentham, Jeremy. “Anarchical Fallacies; Being an Examination of the Declarations of Rights Issued During the French Revolution.” The Works of Jeremy Bentham. 11 vols. Edinburgh: William Tait, 1838-43. Print.
Bentham, Jeremy. A Comment on the Commentaries and A Fragment on Government. Ed. J.H. Burns and H.L.A. Hart. London: Athlone, 1977. Print.
Bentham, Jeremy. An Introduction to the Principles of Morals and Legislation. Oxford: Clarendon, 1907. Print.
Bentham, Jeremy. Of Laws in General. Ed. H.L.A. Hart. London: Athlone, 1970. Print.
Berman, Harold J. Law and Revolution: The Formation of the Western Legal Tradition. Cambridge: Harvard UP, 1983. Print.
Berman, Harold J. “Toward an Integrative Jurisprudence: Politics, Morality, History.” 76 (4) California Law Review (1988): 779-801. Print.
Blackstone, Sir William. Commentaries on the Laws of England. Vol. 1. New York: W.E. Dean, 1838. Print.
Cicero. De Officiis (On Duties). Ed. M.T. Griffin and E.M. Atkins. Cambridge: Cambridge UP, 1991. Print.
Cicero. De Re Publica (On the Republic) and De Legibus (On the Laws). Trans. C.W. Keyes. Ed. Jeffrey Henderson. Bury St. Edmunds, UK: St. Edmundsbury, 2000. Print.
De Bracton, Henry. De Legibus et Consuetudinibus Angliae (On the Laws and Customs of England). Ed. George E. Woodbine. Trans. Samuel E. Thorne. 4 vols. Cambridge: Harvard UP, 1968. Print.
Dewey, John. “Context and Thought.” The Later Works of John Dewey. Ed. Jo Ann Boydston. Vol. 6. Carbondale, IL: S. Illinois UP, 1985. Print.
Dworkin, Ronald. “The Model of Rules.” U. Chi. L. Rev. 35 (1) (1967): 14-46. Print.
Fuller, Lon L. The Morality of Law. New Haven: Yale UP, 1964. Print.
Hamilton, Alexander, John Jay, and James Madison. “Federalist No. 63.” The Federalist Papers. Ed. Ernest O’Dell. Sundown, TX: CreateSpace, 2010. Print.
Hart, H. L. A. The Concept of Law. 2nd ed. Oxford: Clarendon, 1994. Print.
Hart, H. L. A. “Positivism and the Separation of Law and Morals.” Harv. L. Rev. 71 (4) (1958): 593–629. Print.
Hesiod. Theogony, Works and Days, Shield. Trans. Apostolos N. Athanassakis. 2nd ed. Baltimore: Johns Hopkins Press, 2004. Print.
Holmes, Oliver Wendell, Jr. “Natural Law.” Harv. L. Rev. 32 (1) (1918): 40-44. Print.
Holmes, Oliver Wendell, Jr. “The Path of the Law.” Harv. L. Rev. 110 (5) (1997): 991-1009. Print.
Justinian. Corpus Juris Civilis, The Civil Law. Trans. S.P. Scott. 17 vols. Cincinnati: Central Trust, 1932. Print.
Plato. Crito. The Collected Dialogues of Plato, including the Letters. Trans. Lane Cooper. Ed. Edith Hamilton and Huntington Cairns. Princeton: Princeton UP, 1961. Print.
Plato. Protagoras. The Collected Dialogues of Plato, including the Letters. Trans. Lane Cooper. Ed. Edith Hamilton and Huntington Cairns. Princeton: Princeton UP, 1961. Print.
Plato. Gorgias. The Collected Dialogues of Plato, including the Letters. Trans. Lane Cooper. Ed. Edith Hamilton and Huntington Cairns. Princeton: Princeton UP, 1961. Print.
Plato. “Letter VII.” The Collected Dialogues of Plato, including the Letters. Trans. Lane Cooper. Ed. Edith Hamilton and Huntington Cairns. Princeton: Princeton UP, 1961. Print.
Plato. Laws. The Collected Dialogues of Plato, including the Letters. Trans. Lane Cooper. Ed. Edith Hamilton and Huntington Cairns. Princeton: Princeton UP, 1961. Print.
Plato. Theaetetus. The Collected Dialogues of Plato, including the Letters. Trans. Lane Cooper. Ed. Edith Hamilton and Huntington Cairns. Princeton: Princeton UP, 1961. Print.
Plato. The Republic. The Collected Dialogues of Plato, including the Letters. Trans. Lane Cooper. Ed. Edith Hamilton and Huntington Cairns. Princeton: Princeton UP, 1961. Print.
Plutarch. “Themistocles.” Plutarch’s Lives. Trans. Bernadotte Perrin. Cambridge: Harvard UP, 1914. Print.
Pound, Roscoe. “Mechanical Jurisprudence.” Colum. L. Rev. 8 (3) (1908): 605-623. Print.
Radbruch, Gustav. “Five Minutes of Legal Philosophy.” Trans. Bonnie Litschewski Paulson and Stanley L. Paulson. Oxford J. Legal Stud. 26 (1) (2006b): 13-15. Print.
Radbruch, Gustav. “Statutory Lawlessness and Supra-Statutory Law.” Trans. Bonnie Litschewski Paulson and Stanley L. Paulson. Oxford J. Legal Stud. 26 (1) (2006a): 1-11. Print.
Xenophon. Socrates’ Defence. Ed. Robin Waterfield. Trans. Hugh Tredennick and Robin Waterfield. New York: Penguin, 1990. Print.

Author Information

John O. Tyler, Jr.
Email: jtyler@hbu.edu
Houston Baptist University
U. S. A.

Socrates (469—399 B.C.E.)

Socrates is one of the few individuals who, one could say, has so shaped the cultural and intellectual development of the world that, without him, history would be profoundly different. He is best known for his association with the Socratic method of question and answer, his claim that he was ignorant (or aware of his own absence of knowledge), and his claim that the unexamined life is not worth living for human beings. He was the inspiration for Plato, the thinker widely held to be the founder of the Western philosophical tradition. Plato in turn served as the teacher of Aristotle, thus establishing the famous triad of ancient philosophers: Socrates, Plato, and Aristotle. Unlike other philosophers of his time and ours, Socrates never wrote anything down but was committed to living simply and to interrogating the everyday views and popular opinions of those in his home city of Athens. At the age of 70, he was put to death at the hands of his fellow citizens on charges of impiety and corruption of the youth. His trial, along with the social and political context in which it occurred, has warranted as much treatment from historians and classicists as his arguments and methods have from philosophers.

This article gives an overview of Socrates: who he was, what he thought, and his purported method. It is both historical and philosophical. At the same time, it contains reflections on the difficult nature of knowing anything about a person who never committed any of his ideas to the written word. Much of what is known about Socrates comes to us from Plato, although Socrates appears in the works of other ancient writers as well as those who follow Plato in the history of philosophy. This article recognizes that finding the original Socrates may be impossible, but it attempts to achieve a close approximation.

1. Biography: Who was Socrates?
  a. The Historical Socrates
    i. Birth and Early Life
    ii. Later Life and Trial
      1. The Peloponnesian War and the Threat to Democracy
      2. Greek Religion and Socrates’ Impiety
  b. The Socratic Problem: the Philosophical Socrates
2. Content: What does Socrates Think?
3. Method: How Did Socrates Do Philosophy?
4. Legacy: How Have Other Philosophers Understood Socrates?
  a. Hellenistic Philosophy
  b. Modern Philosophy
    i. Hegel
    ii. Kierkegaard
    iii. Nietzsche
    iv. Heidegger
    v. Gadamer
5. References and Further Reading

1. Biography: Who was Socrates?

a. The Historical Socrates

i. Birth and Early Life

Socrates was born in Athens in the year 469 B.C.E. to Sophroniscus, a stonemason, and Phaenarete, a midwife. His family was not extremely poor, but they were by no means wealthy, and Socrates could not claim that he was of noble birth like Plato. He grew up in the political deme or district of Alopece, and when he turned 18, began to perform the typical political duties required of Athenian males. These included compulsory military service and membership in the Assembly, the governing body responsible for determining military strategy and legislation.

In a culture that worshipped male beauty, Socrates had the misfortune of being born incredibly ugly. Many of our ancient sources attest to his rather awkward physical appearance, and Plato more than once makes reference to it (Theaetetus 143e, Symposium 215a-c; also Xenophon Symposium 4.19, 5.5-7 and Aristophanes Clouds 362). Socrates was exophthalmic, meaning that his eyes bulged out of his head and were not straight but focused sideways. He had a snub nose, which made him resemble a pig, and many sources depict him with a potbelly. Socrates did little to help his odd appearance, frequently wearing the same cloak and sandals throughout both the day and the evening. Plato’s Symposium (174a) offers us one of the few accounts of his caring for his appearance.

As a young man Socrates was given an education appropriate for a person of his station. By the middle of the 5th century B.C.E., all Athenian males were taught to read and write. Sophroniscus, however, also took pains to give his son an advanced cultural education in poetry, music, and athletics. In both Plato and Xenophon, we find a Socrates who is well versed in poetry, talented at music, and quite at home in the gymnasium. In accordance with Athenian custom, his father also taught him a trade, though Socrates did not labor at it on a daily basis. Rather, he spent his days in the agora (the Athenian marketplace), asking questions of those who would speak with him. While he was poor, he quickly acquired a following of rich young aristocrats (one of whom was Plato) who particularly enjoyed hearing him interrogate those who were purported to be the wisest and most influential men in the city.

Socrates was married to Xanthippe, and according to some sources, had a second wife. Most suggest that he first married Xanthippe, and that she gave birth to his first son, Lamprocles. He is alleged to have married his second wife, Myrto, without dowry, and she gave birth to his other two sons, Sophroniscus and Menexenus. Various accounts attribute Sophroniscus to Xanthippe, while others even suggest that Socrates was married to both women simultaneously because of a shortage of males in Athens at the time. In accordance with Athenian custom, Socrates was open about his physical attraction to young men, though he always subordinated his physical desire for them to his desire that they improve the condition of their souls.

Socrates fought valiantly during his time in the Athenian military. Just before the Peloponnesian War with Sparta began in 431 B.C.E., he helped the Athenians win the battle of Potidaea (432 B.C.E.), after which he saved the life of Alcibiades, the famous Athenian general. He also fought as one of 7,000 hoplites alongside 20,000 troops at the battle of Delium (424 B.C.E.) and once more at the battle of Amphipolis (422 B.C.E.). Both battles were defeats for Athens.

Despite his continued service to his city, many members of Athenian society perceived Socrates to be a threat to their democracy, and it is this suspicion that largely contributed to his conviction in court. It is therefore imperative to understand the historical context in which his trial was set.

ii. Later Life and Trial

1. The Peloponnesian War and the Threat to Democracy

From 431 to 404 B.C.E. Athens fought one of its bloodiest and most protracted conflicts with neighboring Sparta, the war that we now know as the Peloponnesian War. Aside from the fact that Socrates fought in the conflict, the war is important for an account of his life and trial because many of those with whom Socrates spent his time became either sympathetic to the Spartan cause at the very least or traitors to Athens at worst. This is particularly the case with those from the more aristocratic Athenian families, who tended to favor the rigid and restricted hierarchy of power in Sparta instead of the more widespread democratic distribution of power and free speech to all citizens that obtained in Athens. Plato more than once places in the mouth of his character Socrates praise for Sparta (Protagoras 342b, Crito 53a; cf. Republic 544c in which most people think the Spartan constitution is the best). The political regime of the Republic is marked by a small group of ruling elites that preside over the citizens of the ideal city.

There are a number of important historical moments throughout the war leading up to Socrates’ trial that figure in the perception of him as a traitor. Seven years after the battle of Amphipolis, the Athenian navy was set to invade the island of Sicily, when a number of statues in the city called “herms”, dedicated to the god Hermes, protector of travelers, were destroyed. Dubbed the ‘Mutilation of the Herms’ (415 B.C.E.), this event engendered a fear not only of those who might seek to undermine the democracy, but also of those who did not respect the gods. In conjunction with these crimes, Athens witnessed the profanation of the Eleusinian mysteries, religious rituals that were to be conducted only in the presence of priests but that were in this case performed in private homes without official sanction or recognition of any kind. Amongst those accused and persecuted on suspicion of involvement in the crimes were a number of Socrates’ associates, including Alcibiades, who was recalled from his position leading the expedition in Sicily. Rather than face prosecution for the crime, Alcibiades escaped and sought asylum in Sparta.

Though Alcibiades was not the only one of Socrates’ associates implicated in the sacrilegious crimes (Charmides and Critias were suspected as well), he is arguably the most important. Socrates had by many accounts been in love with Alcibiades, and Plato depicts him pursuing or speaking of his love for him in many dialogues (Symposium 213c-d, Protagoras 309a, Gorgias 481d, Alcibiades I 103a-104c, 131e-132a). Alcibiades is typically portrayed as a wandering soul (Alcibiades I 117c-d), not committed to any one consistent way of life or definition of justice. Instead, he was a kind of chameleon-like flatterer who could change and mold himself in order to please crowds and win political favor (Gorgias 482a). In 411 B.C.E., a group of citizens opposed to the Athenian democracy led a coup against the government in hopes of establishing an oligarchy. Though the democrats put down the coup later that year and recalled Alcibiades to lead the Athenian fleet in the Hellespont, he aided the oligarchs by securing for them an alliance with the Persian satraps. Alcibiades therefore did not just aid the Spartan cause but allied himself with Persian interests as well. His association with the two principal enemies of Athens reflected poorly on Socrates, and Xenophon tells us that Socrates’ repeated association with and love for Alcibiades was instrumental in the suspicion that he was a Spartan apologist.

Sparta finally defeated Athens in 404 B.C.E., just five years before Socrates’ trial and execution. Instead of a democracy, they installed as rulers a small group of Athenians who were loyal to Spartan interests. Known as “The Thirty” or sometimes as the “Thirty Tyrants”, they were led by Critias, a known associate of Socrates and a member of his circle. Critias’ nephew Charmides, about whom we have a Platonic dialogue of the same name, was also a member. Though Critias put forth a law prohibiting Socrates from conducting discussions with young men under the age of 30, Socrates’ earlier association with him—as well as his willingness to remain in Athens and endure the rule of the Thirty rather than flee—further contributed to the growing suspicion that Socrates was opposed to the democratic ideals of his city.

The Thirty ruled tyrannically—executing a number of wealthy Athenians as well as confiscating their property, arbitrarily arresting those with democratic sympathies, and exiling many others—until they were overthrown in 403 B.C.E. by a group of democratic exiles returning to the city. Both Critias and Charmides were killed and, after a Spartan-sponsored peace accord, the democracy was restored. The democrats proclaimed a general amnesty in the city and thereby prevented politically motivated legal prosecutions aimed at redressing the terrible losses incurred during the reign of the Thirty. Their hope was to maintain unity during the reestablishment of their democracy.

One of Socrates’ main accusers, Anytus, was one of the democratic exiles that returned to the city to assist in the overthrow of the Thirty. Plato’s Meno, set in the year 402 B.C.E., imagines a conversation between Socrates and Anytus in which the latter argues that any citizen of Athens can teach virtue, an especially democratic view insofar as it assumes knowledge of how to live well is not the restricted domain of the esoteric elite or privileged few. In the discussion, Socrates argues that if one wants to know about virtue, one should consult an expert on virtue (Meno 91b-94e). The political turmoil of the city, rebuilding itself as a democracy after nearly thirty years of destruction and bloodshed, constituted a context in which many citizens were especially fearful of threats to their democracy that came not from the outside, but from within their own city.

While many of his fellow citizens found considerable evidence against Socrates, there was also historical evidence in addition to his military service for the case that he was not just a passive but an active supporter of the democracy. For one thing, just as he had associates that were known oligarchs, he also had associates that were supporters of the democracy, including the metic family of Cephalus and Socrates’ friend Chaerephon, the man who reported that the oracle at Delphi had proclaimed that no man was wiser than Socrates. Additionally, when he was ordered by the Thirty to help retrieve the democratic general Leon from the island of Salamis for execution, he refused to do so. His refusal could be understood not as the defiance of a legitimately established government but rather his allegiance to the ideals of due process that were in effect under the previously instituted democracy. Indeed, in Plato’s Crito, Socrates refuses to escape from prison on the grounds that he lived his whole life with an implied agreement with the laws of the democracy (Crito 50a-54d). Notwithstanding these facts, there was profound suspicion that Socrates was a threat to the democracy in the years after the end of the Peloponnesian War. But because of the amnesty, Anytus and his fellow accusers Meletus and Lycon were prevented from bringing suit against Socrates on political grounds. They opted instead for religious grounds.

2. Greek Religion and Socrates’ Impiety

Because of the amnesty, the charges made against Socrates were framed in religious terms. As recounted by Diogenes Laertius (2.5.40), the charges were stated as follows: “Socrates does criminal wrong by not recognizing the gods that the city recognizes, and furthermore by introducing new divinities; and he also does criminal wrong by corrupting the youth” (other accounts: Xenophon, Memorabilia I.1.1 and Apology 11-12; Plato, Apology 24b and Euthyphro 2c-3b). Many people understood the charge about corrupting the youth to signify that Socrates taught his subversive views to others, a claim that he adamantly denies in his defense speech by claiming that he has no wisdom to teach (Plato, Apology 20c) and that he cannot be held responsible for the actions of those who heard him speak (Plato, Apology 33a-c).

It is now customary to refer to the principal written accusation on the deposition submitted to the Athenian court as an accusation of impiety, or unholiness. Rituals, ceremonies, and sacrifices that were officially sanctioned by the city and its officials marked ancient Greek religion. The sacred was woven into the everyday experience of citizens who demonstrated their piety by correctly observing their ancestral traditions. Interpretation of the gods at their temples was the exclusive domain of priests appointed and recognized by the city. The boundary and separation between the religious and the secular that we find in many countries today therefore did not obtain in Athens. A religious crime was consequently an offense not just against the gods, but also against the city itself.

Socrates and his contemporaries lived in a polytheistic society, a society in which the gods did not create the world but were themselves created. Socrates would have been brought up with the stories of the gods recounted in Hesiod and Homer, in which the gods were not omniscient, omnibenevolent, or eternal, but rather power-hungry super-creatures that regularly intervened in the affairs of human beings. One thinks for example of Aphrodite saving Paris from death at the hands of Menelaus (Homer, Iliad 3.369-382) or Zeus sending Apollo to rescue the corpse of Sarpedon after his death in battle (Homer, Iliad 16.667-684). Human beings were to fear the gods, sacrifice to them, and honor them with festivals and prayers.

Socrates instead seemed to have a conception of the divine as always benevolent, truthful, authoritative, and wise. For him, divinity always operated in accordance with the standards of rationality. This conception of divinity, however, dispenses with the traditional conception of prayer and sacrifice as motivated by hopes for material payoff. Socrates’ theory of the divine seemed to make the most important rituals and sacrifices in the city entirely useless, for if the gods are all good, they will benefit human beings regardless of whether or not human beings make offerings to them. Jurors at his trial might have thought that, without the expectation of material reward or protection from the gods, Socrates was disconnecting religion from its practical roots and its connection with the civic identity of the city.

While Socrates was critical of blind acceptance of the gods and the myths we find in Hesiod and Homer, this in itself was not unheard of in Athens at the time. Solon, Xenophanes, Heraclitus, and Euripides had all spoken against the capriciousness and excesses of the gods without incurring penalty. It is possible to make the case that Socrates’ jurors might not have indicted him solely for questioning the gods or even for interrogating the true meaning of piety. Indeed, there was no legal definition of piety in Athens at the time, and jurors were therefore in a similar situation to the one in which we find Socrates in Plato’s Euthyphro, that is, in need of an inquiry into what the nature of piety truly is. What seems to have concerned the jurors was not only Socrates’ challenge to the traditional interpretation of the gods of the city, but his seeming allegiance to an entirely novel divine being, unfamiliar to anyone in the city.

This new divine being is what is known as Socrates’ daimon. Though it has become customary to think of a daimon as a spirit or quasi-divinity (for example, Symposium 202e-203a), in ancient Greek religion it was not solely a specific class of divine being but rather a mode of activity, a force that drives a person when no particular divine agent can be named (Burkert, 180). Socrates claimed to have heard, from his days as a child, a sign or voice that accompanied him and forbade him to pursue certain courses of action (Plato, Apology 31c-d, 40a-b, Euthydemus 272e-273a, Euthyphro 3b, Phaedrus 242b, Theages 128-131a, Theaetetus 150c-151b, Republic 496c; Xenophon, Apology 12, Memorabilia 1.1.3-5). Xenophon adds that the sign also issued positive commands (Memorabilia 1.1.4, 4.3.12, 4.8.1, Apology 12). This sign was accessible only to Socrates, private and internal to his own mind. Whether Socrates received moral knowledge of any sort from the sign is a matter of scholarly debate, but beyond doubt is the strangeness of Socrates’ insistence that he took private instructions from a deity that was unlicensed by the city. For all the jurors knew, the deity could have been hostile to Athenian interests. Socrates’ daimon was therefore extremely influential in his indictment on the charge of worshipping new gods unknown to the city (Plato, Euthyphro 3b, Xenophon, Memorabilia I.1.2).

Whereas in Plato’s Apology Socrates makes no attempt to reconcile his divine sign with traditional views of piety, Xenophon’s Socrates argues that just as there are those who rely on birdcalls and receive guidance from voices, so he too is influenced by his daimon. However, Socrates had no officially sanctioned religious role in the city. As such, his attempt to assimilate himself to a seer or necromancer appointed by the city to interpret divine signs actually may have undermined his innocence, rather than help to establish it. His insistence that he had direct, personal access to the divine made him appear guilty to enough jurors that he was sentenced to death.

b. The Socratic Problem: the Philosophical Socrates

The Socratic problem is the problem faced by historians of philosophy when attempting to reconstruct the ideas of the original Socrates as distinct from his literary representations. While we know many of the historical details of Socrates’ life and the circumstances surrounding his trial, Socrates’ identity as a philosopher is much more difficult to establish. Because he wrote nothing, what we know of his ideas and methods comes to us mainly from his contemporaries and disciples.

There were a number of Socrates’ followers who wrote conversations in which he appears. These works are what are known as the logoi sokratikoi, or Socratic accounts. Aside from those of Plato and Xenophon, most of these dialogues have not survived. What we know of them comes to us from other sources. For example, very little survives from the dialogues of Antisthenes, whom Xenophon reports as one of Socrates’ leading disciples. Indeed, from polemics written by the rhetor Isocrates, some scholars have concluded that Antisthenes was the most prominent Socratic in Athens for the first decade following Socrates’ death. Diogenes Laertius (6.10-13) attributes to Antisthenes a number of views that we recognize as Socratic, including that virtue is sufficient for happiness, the wise man is self-sufficient, only the virtuous are noble, the virtuous are friends, and good things are morally fine and bad things are base.

Aeschines of Sphettus wrote seven dialogues, all of which have been lost. It is possible for us to reconstruct the plots of two of them: the Alcibiades—in which Socrates shames Alcibiades into admitting he needs Socrates’ help to be virtuous—and the Aspasia—in which Socrates recommends the famous wife of Pericles as a teacher for the son of Callias. Aeschines’ dialogues focus on Socrates’ ability to help his interlocutor acquire self-knowledge and better himself.

Phaedo of Elis wrote two dialogues. His central use of Socrates is to show that philosophy can improve anyone regardless of his social class or natural talents. Euclides of Megara wrote six dialogues, about which we know only their titles. Diogenes Laertius reports that he held that the good is one, that insight and prudence are different names for the good, and that what is opposed to the good does not exist. All three are Socratic themes. Lastly, Aristippus of Cyrene wrote no Socratic dialogues but is alleged to have written a work entitled To Socrates.

The two Socratics on whom most of our philosophical understanding of Socrates depends are Plato and Xenophon. Scholars also rely on the works of the comic playwright Aristophanes and Plato’s most famous student, Aristotle.

i. Origin of the Socratic Problem

The Socratic problem first became pronounced in the early 19^th century with the influential work of Friedrich Schleiermacher. Until this point, scholars had largely turned to Xenophon to identify what the historical Socrates thought. Schleiermacher argued that Xenophon was not a philosopher but rather a simple citizen-soldier, and that his Socrates was so dull and philosophically uninteresting that, reading Xenophon alone, it would be difficult to understand the reputation accorded Socrates by so many of his contemporaries and nearly all the schools of philosophy that followed him. The better portrait of Socrates, Schleiermacher claimed, comes to us from Plato.

Though many scholars have since jettisoned Xenophon as a legitimate source for representing the philosophical views of the historical Socrates, they remain divided over the reliability of the other three sources. For one thing, Aristophanes was a comic playwright, and therefore took considerable poetic license when scripting his characters. Aristotle, born 15 years after Socrates’ death, hears about Socrates primarily from Plato. Plato himself wrote dialogues or philosophical dramas, and thus cannot be understood to be presenting his readers with exact replicas or transcriptions of conversations that Socrates actually had. Furthermore, many scholars think that Plato’s so-called middle and late dialogues do not present the views of the historical Socrates.

We therefore see the difficult nature of the Socratic problem: because we don’t seem to have any consistently reliable sources, finding the true Socrates or the original Socrates proves to be an impossible task. What we are left with, instead, is a composite picture assembled from various literary and philosophical components that give us what we might think of as Socratic themes or motifs.

ii. Aristophanes

Born in 450 B.C.E., Aristophanes wrote a number of comic plays intended to satirize and caricature many of his fellow Athenians. His Clouds (423 B.C.E.) was so instrumental in parodying Socrates and painting him as a dangerous intellectual capable of corrupting the entire city that Socrates felt compelled in his trial defense to allude to the bad reputation he acquired as a result of the play (Plato, Apology 18a-b, 19c). Aristophanes was much closer in age to Socrates than Plato and Xenophon, and as such is the only one of our sources exposed to Socrates in his younger years.

In the play, Socrates is the head of a phrontistêrion, a school of learning where students are taught the nature of the heavens and how to win court cases. Socrates appears in a swing high above the stage, purportedly to better study the heavens. His patron deities, the clouds, represent his interest in meteorology and may also symbolize the lofty nature of reasoning that may take either side of an argument. The main plot of the play centers on an indebted man called Strepsiades, whose son Phidippides ends up in the school to learn how to help his father avoid paying off his debts. By the end of the play, Phidippides has beaten his father, arguing that it is perfectly reasonable to do so on the grounds that, just as it is acceptable for a father to spank his son for his own good, so it is acceptable for a son to hit a father for his own good. In addition to the theme that Socrates corrupts the youth, we therefore also find in the Clouds the origin of the rumor that Socrates makes the stronger argument the weaker and the weaker argument the stronger. Indeed, the play features a personification of the Stronger Argument—which represents traditional education and values—attacked by the Weaker Argument—which advocates a life of pleasure.

While the Clouds is Aristophanes’ most famous and comprehensive attack on Socrates, Socrates appears in others of his comedies as well. In the Birds (414 B.C.E.), Aristophanes coins a Greek verb based on Socrates’ name to insinuate that Socrates was truly a Spartan sympathizer (1280-83). Young men who were found “Socratizing” were expressing their admiration of Sparta and its customs. And in the Frogs (405 B.C.E.), the Chorus claims that it is not refined to keep company with Socrates, who ignores the poets and wastes time with ‘frivolous words’ and ‘pompous word-scraping’ (1491-1499).

Aristophanes’ Socrates is a kind of variegated caricature of trends and new ideas emerging in Athens that he believed were threatening to the city. We find a number of such themes prevalent in Presocratic philosophy and the teachings of the Sophists, including those about natural science, mathematics, social science, ethics, political philosophy, and the art of words. Amongst other things, Aristophanes was troubled by the displacement of the divine through scientific explanations of the world and the undermining of traditional morality and custom by explanations of cultural life that appealed to nature instead of the gods. Additionally, he was wary of the teaching of skill in disputation, for fear that a clever speaker could just as easily argue against the truth as argue for it. These issues constitute what is sometimes called the “new learning” developing in 5^th century B.C.E. Athens, for which the Aristophanic Socrates is the iconic symbol.

iii. Xenophon

Born in the same decade as Plato (425 B.C.E.), Xenophon lived in the political deme of Erchia. Though he knew Socrates he would not have had as much contact with him as Plato did. He was not present in the courtroom on the day of Socrates’ trial, but rather heard an account of it later on from Hermogenes, a member of Socrates’ circle. His depiction of Socrates is found principally in four works: Apology—in which Socrates gives a defense of his life before his jurors—Memorabilia—in which Xenophon himself explicates the charges against Socrates and tries to defend him—Symposium—a conversation between Socrates and his friends at a drinking party—and Oeconomicus—a Socratic discourse on estate management. Socrates also appears in Xenophon’s Hellenica and Anabasis.

Xenophon’s reputation as a source on the life and ideas of Socrates is one on which scholars do not always agree. Largely thought to be a significant source of information about Socrates before the 19^th century, for most of the 20^th century Xenophon’s ability to depict Socrates as a philosopher was largely called into question. Following Schleiermacher, many argued that Xenophon himself was either a bad philosopher who did not understand Socrates, or not a philosopher at all, more concerned with practical, everyday matters like economics. However, recent scholarship has sought to challenge this interpretation, arguing that it assumes an understanding of philosophy as an exclusively speculative and critical endeavor that does not attend to the ancient conception of philosophy as a comprehensive way of life.

While Plato will likely always remain the principal source on Socrates and Socratic themes, Xenophon’s Socrates is distinct in philosophically interesting ways. He emphasizes the values of self-mastery (enkrateia), endurance of physical pain (karteria), and self-sufficiency (autarkeia). For Xenophon’s Socrates, self-mastery or moderation is the foundation of virtue (Memorabilia, 1.5.4). Whereas in Plato’s Apology the oracle tells Chaerephon that no one is wiser than Socrates, in Xenophon’s Apology Socrates claims that the oracle told Chaerephon that “no man was more free than I, more just, and more moderate” (Xenophon, Apology, 14).

Part of Socrates’ freedom consists in his freedom from want, precisely because he has mastered himself. As opposed to Plato’s Socrates, Xenophon’s Socrates is not poor, not because he has much, but because he needs little. Oeconomicus 11.3 for instance shows Socrates displeased with those who think him poor. One can be rich even with very little on the condition that one has limited his needs, for wealth is just the excess of what one has over what one requires. Socrates is rich because what he has is sufficient for what he needs (Memorabilia 1.2.1, 1.3.5, 4.2.38-9).

We also find Xenophon attributing to Socrates a proof of the existence of God. The argument holds that human beings are the product of an intelligent design, and we therefore should conclude that there is a God who is the maker (dēmiourgos) or designer of all things (Memorabilia 1.4.2-7). God creates a systematically ordered universe and governs it in the way our minds govern our bodies (Memorabilia 1.4.1-19, 4.3.1-18). While Plato’s Timaeus tells the story of a dēmiourgos creating the world, it is Timaeus, not Socrates, who tells the story. Indeed, Socrates speaks only sparingly at the beginning of the dialogue, and most scholars do not count as Socratic the cosmological arguments therein.

iv. Plato

Plato was Socrates’ most famous disciple, and the majority of what most people know about Socrates is known about Plato’s Socrates. Plato was born to one of the wealthiest and most politically influential families in Athens in 427 B.C.E., the son of Ariston and Perictione. His brothers were Glaucon and Adeimantus, who are Socrates’ principal interlocutors for the majority of the Republic. Though Socrates is not present in every Platonic dialogue, he is in the majority of them, often acting as the main interlocutor who drives the conversation.

The attempt to extract Socratic views from Plato’s texts is itself a notoriously difficult problem, bound up with questions about the order in which Plato composed his dialogues, one’s methodological approach to reading them, and whether or not Socrates, or anyone else for that matter, speaks for Plato. Readers interested in the details of this debate should consult “Plato.” Generally speaking, the predominant view of Plato’s Socrates in the English-speaking world from the middle to the end of the 20^th century was simply that he was Plato’s mouthpiece. In other words, anything Socrates says in the dialogues is what Plato thought at the time he wrote the dialogue. This view, put forth by the famous Plato scholar Gregory Vlastos, has been challenged in recent years, with some scholars arguing that Plato has no mouthpiece in the dialogues (see Cooper xxi-xxiii). While we can attribute to Plato certain doctrines that are consistent throughout his corpus, there is no reason to think that Socrates, or any other speaker, always and consistently espouses these doctrines.

The main interpretive obstacle for those seeking the views of Socrates from Plato is the question of the order of the dialogues. Thrasyllus, the 1^st century (C.E.) Platonist who was the first to arrange the dialogues according to a specific paradigm, organized the dialogues into nine tetralogies, or groups of four, on the basis of the order in which he believed they should be read. Another approach, customary for most scholars by the late 20^th century, groups the dialogues into three categories on the basis of the order in which Plato composed them. Plato begins his career, so the narrative goes, representing his teacher Socrates in typically short conversations about ethics, virtue, and the best human life. These are “early” dialogues. Only subsequently does Plato develop his own philosophical views—the most famous of which is the doctrine of the Forms or Ideas—that Socrates defends. These “middle” dialogues put forth positive doctrines that are generally thought to be Platonic and not Socratic. Finally, towards the end of his life, Plato composes dialogues in which Socrates typically either hardly features at all or is altogether absent. These are the “late” dialogues.

There are a number of complications with this interpretive thesis, and many of them focus on the portrayal of Socrates. Though the Gorgias is an early dialogue, Socrates concludes the dialogue with a myth that some scholars attribute to a Pythagorean influence on Plato that he would not have had during Socrates’ lifetime. Though the Parmenides is a middle dialogue, the younger Socrates speaks only at the beginning before Parmenides alone speaks for the remainder of the dialogue. While the Philebus is a late dialogue, Socrates is the main speaker. Some scholars identify the Meno as an early dialogue because Socrates refutes Meno’s attempts to articulate the nature of virtue. Others, focusing on Socrates’ use of the theory of recollection and the method of hypothesis, argue that it is a middle dialogue. Finally, while Plato’s most famous work, the Republic, is a middle dialogue, some scholars make a distinction within the Republic itself. The first book, they argue, is Socratic, because in it we find Socrates refuting Thrasymachus’ definition of justice while maintaining that he knows nothing about justice. The rest of the dialogue, they claim, with its emphasis on the division of the soul and the metaphysics of the Forms, is Platonic.

To discern a consistent Socrates in Plato is therefore a difficult task. Instead of speaking about chronology of composition, contemporary scholars searching for views that are likely to have been associated with the historical Socrates generally focus on a group of dialogues that are united by topical similarity. These “Socratic dialogues” feature Socrates as the principal speaker, challenging his interlocutor to elaborate on and critically examine his own views while typically not putting forth substantive claims of his own. These dialogues—including those that some scholars think are not written by Plato and those that most scholars agree are not written by Plato but that Thrasyllus included in his collection—are as follows: Euthyphro, Apology, Crito, Alcibiades I, Alcibiades II, Hipparchus, Rival Lovers, Theages, Charmides, Laches, Lysis, Euthydemus, Protagoras, Gorgias, Meno, Greater Hippias, Lesser Hippias, Ion, Menexenus, Clitophon, Minos. Some of the more famous positions Socrates defends in these dialogues are covered in the content section.

v. Aristotle

Aristotle was born in 384 B.C.E., 15 years after the death of Socrates. At the age of eighteen, he went to study at Plato’s Academy, and remained there for twenty years. Afterwards, he traveled throughout Asia and was invited by Philip II of Macedon to tutor his son Alexander, known to history as Alexander the Great. While Aristotle would never have had the chance to meet Socrates, we have in his writings an account of both Socrates’ method and the topics about which he had conversations. Given the likelihood that Aristotle heard about Socrates from Plato and those at his Academy, it is not surprising that most of what he says about Socrates follows the depiction of him in the Platonic dialogues.

Aristotle relates four concrete points about Socrates. The first is that Socrates asked questions without supplying an answer of his own, because he claimed to know nothing (De Sophisticis Elenchis 183b6-8). The picture of Socrates here is consistent with that of Plato’s Apology. Second, Aristotle claims that Socrates never asked questions about nature, but concerned himself only with ethical questions. Aristotle thus attributes to the historical Socrates both the method and topics we find in Plato’s Socratic dialogues.

Third, Aristotle claims that Socrates is the first to have employed epagōgē, a word typically rendered in English as “induction.” This translation, however, is misleading if it leads us to impute to Socrates a preference for inductive reasoning as opposed to deductive reasoning. The term better indicates that Socrates was fond of arguing via the use of analogy. For instance, just as a doctor does not practice medicine for himself but for the best interest of his patient, so the ruler in the city takes no account of his own personal profit, but is rather interested in caring for his citizens (Republic 342d-e).

The fourth and final claim Aristotle makes about Socrates itself has two parts. First, Socrates was the first to ask the question, ti esti: what is it? For example, if someone were to suggest to Socrates that our children should grow up to be courageous, he would ask, what is courage? That is, what is the universal definition or nature that holds for all examples of courage? Second, as distinguished from Plato, Socrates did not separate universals from their particular instantiations. For Plato, the noetic object, the knowable thing, is the separate universal, not the particular. Socrates simply asked the “what is it” question (on this and the previous two points, see Metaphysics I.6.987a29-b14; cf. b22-24, b27-33, and see XIII.4.1078b12-34).

2. Content: What does Socrates Think?

Given the nature of these sources, the task of recounting what Socrates thought is not an easy one. Nonetheless, reading Plato’s Apology, it is possible to articulate a number of themes that scholars today typically associate with Socrates. Plato the author has his Socrates claim that Plato was present in the courtroom for Socrates’ defense (Apology 34a), and while this cannot mean that Plato records the defense as a word-for-word transcription, it is the closest thing we have to an account of what Socrates actually said at a concrete point in his life.

a. Presocratic Philosophy and the Sophists

Socrates opens his defense speech by defending himself against his older accusers (Apology 18a), claiming they have poisoned the minds of his jurors since they were all young men. Amongst these accusers was Aristophanes. In addition to the claim that Socrates makes the worse argument into the stronger, there is a rumor that Socrates idles the day away talking about things in the sky and below the earth. His reply is that he never discusses such topics (Apology 18a-c). Socrates is distinguishing himself here not just from the sophists and their alleged ability to invert the strength of arguments, but from those we have now come to call the Presocratic philosophers.

The Presocratics were not just those who came before Socrates, for there are some Presocratic philosophers who were his contemporaries. The term is sometimes used to suggest that, while Socrates cared about ethics, the Presocratic philosophers did not. This is misleading, for we have evidence that a number of Presocratics explored ethical issues. The term is best used to refer to the group of thinkers whom Socrates did not influence and whose fundamental uniting characteristic was that they sought to explain the world in terms of its own inherent principles. The 6^th century Milesian Thales, for instance, believed that the fundamental principle of all things was water. Anaximander believed the principle was the indefinite (apeiron), and for Anaximenes it was air. Later in Plato’s Apology (26d-e), Socrates rhetorically asks whether Meletus thinks he is prosecuting Anaxagoras, the 5^th century thinker who argued that the universe was originally a mixture of elements that have since been set in motion by Nous, or Mind. Socrates suggests that he does not engage in the same sort of cosmological inquiries that were the main focus of many Presocratics.

The other group against which Socrates compares himself is the Sophists, learned men who travelled from city to city offering to teach the youth for a fee. While he claims he thinks it an admirable thing to teach as Gorgias, Prodicus, or Hippias claim they can (Apology 20a), he argues that he himself does not have knowledge of human excellence or virtue (Apology 20b-c). Though Socrates inquires after the nature of virtue, he does not claim to know it, and certainly does not ask to be paid for his conversations.

b. Socratic Themes in Plato’s Apology

i. Socratic Ignorance

Plato’s Socrates moves next to explain the reason he has acquired the reputation he has and why so many citizens dislike him. The oracle at Delphi told Socrates’ friend Chaerephon, “no one is wiser than Socrates” (Apology 21a). Socrates explains that he was not aware of any wisdom he had, and so set out to find someone who had wisdom in order to demonstrate that the oracle was mistaken. He first went to the politicians but found them lacking wisdom. He next visited the poets and found that, though they spoke in beautiful verses, they did so through divine inspiration, not because they had wisdom of any kind. Finally, Socrates found that the craftsmen had knowledge of their own craft, but that they subsequently believed themselves to know much more than they actually did. Socrates concluded that he was better off than his fellow citizens because, while they thought they knew something and did not, he was aware of his own ignorance. The god who speaks through the oracle, he says, is truly wise, whereas human wisdom is worth little or nothing (Apology 23a).

This awareness of one’s own absence of knowledge is what is known as Socratic ignorance, and it is arguably the thing for which Socrates is most famous. Socratic ignorance is sometimes called simple ignorance, to be distinguished from the double ignorance of the citizens with whom Socrates spoke. Simple ignorance is being aware of one’s own ignorance, whereas double ignorance is not being aware of one’s ignorance while thinking that one knows. In showing many influential figures in Athens that they did not know what they thought they did, Socrates came to be despised in many circles.

It is worth noting that Socrates does not claim here that he knows nothing. He claims that he is aware of his ignorance and that whatever it is that he does know is worthless. Socrates has a number of strong convictions about what makes for an ethical life, though he cannot articulate precisely why these convictions are true. He believes for instance that it is never just to harm anyone, whether friend or enemy, but he does not, at least in Book I of the Republic, offer a systematic account of the nature of justice that could demonstrate why this is true. Because of his insistence on repeated inquiry, Socrates has refined his convictions such that he can hold particular views about justice while maintaining that he does not know the complete nature of justice.

We can see this contrast quite clearly in Socrates’ cross-examination of his accuser Meletus. Because he is charged with corrupting the youth, Socrates inquires after who it is that helps the youth (Apology, 24d-25a). In the same way that we take a horse to a horse trainer to improve it, Socrates wants to know the person to whom we take a young person to educate him and improve him. Meletus’ silence condemns him: he has never bothered to reflect on such matters, and therefore is unaware of his ignorance about matters that are the foundation of his own accusation (Apology 25b-c). Whether or not Socrates—or Plato for that matter—actually thinks it is possible to achieve expertise in virtue is a subject on which scholars disagree.

ii. Priority of the Care of the Soul

Throughout his defense speech (Apology 20a-b, 24c-25c, 31b, 32d, 36c, 39d) Socrates repeatedly stresses that a human being must care for his soul more than anything else (see also Crito 46c-47d, Euthyphro 13b-c, Gorgias 520a4ff). Socrates found that his fellow citizens cared more for wealth, reputation, and their bodies while neglecting their souls (Apology 29d-30b). He believed that his mission from the god was to examine his fellow citizens and persuade them that the most important good for a human being was the health of the soul. Wealth, he insisted, does not bring about human excellence or virtue, but virtue makes wealth and everything else good for human beings (Apology 30b).

Socrates believes that his mission of caring for souls extends to the entirety of the city of Athens. He argues that the god gave him to the city as a gift and that his mission is to help improve the city. He thus attempts to show that he is not guilty of impiety precisely because everything he does is in response to the oracle and at the service of the god. Socrates characterizes himself as a gadfly and the city as a sluggish horse in need of stirring up (Apology 30e). Without philosophical inquiry, the democracy becomes stagnant and complacent, in danger of harming itself and others. Just as the gadfly is an irritant to the horse but rouses it to action, so Socrates supposes that his purpose is to agitate those around him so that they begin to examine themselves. One might compare this claim with Socrates’ assertion in the Gorgias that, while his contemporaries aim at gratification, he practices the true political craft because he aims at what is best (521d-e). Such comments, in addition to the historical evidence that we have, are Socrates’ strongest defense that he is not only not a burden to the democracy but a great asset to it.

iii. The Unexamined Life

After the jury has convicted Socrates and sentenced him to death, he makes one of the most famous proclamations in the history of philosophy. He tells the jury that he could never keep silent, because “the unexamined life is not worth living for human beings” (Apology 38a). We find here Socrates’ insistence that we are all called to reflect upon what we believe, account for what we know and do not know, and generally speaking to seek out, live in accordance with, and defend those views that make for a well lived and meaningful life.

Some scholars call attention to Socrates’ emphasis on human nature here, and argue that the call to live examined lives follows from our nature as human beings. We are naturally directed by pleasure and pain. We are drawn to power, wealth and reputation, the sorts of values to which Athenians were drawn as well. Socrates’ call to live examined lives is not necessarily an insistence to reject all such motivations and inclinations but rather an injunction to appraise their true worth for the human soul. The purpose of the examined life is to reflect upon our everyday motivations and values and to subsequently inquire into what real worth, if any, they have. If they have no value or indeed are even harmful, it is upon us to pursue those things that are truly valuable.

One can see in reading the Apology that Socrates examines the lives of his jurors during his own trial. By asserting the primacy of the examined life after he has been convicted and sentenced to death, Socrates, the prosecuted, becomes the prosecutor, surreptitiously accusing those who convicted him of not living a life that respects their own humanity. He tells them that by killing him they will not escape examining their lives. To escape giving an account of one’s life is neither possible nor good, Socrates claims, but it is best to prepare oneself to be as good as possible (Apology 39d-e).

We find here a conception of a well-lived life that differs from one that would likely be supported by many contemporary philosophers. Today, most philosophers would argue that we must live ethical lives (though what this means is of course a matter of debate) but that it is not necessary for everyone to engage in the sort of discussions Socrates had every day, nor must one do so in order to be considered a good person. A good person, we might say, lives a good life insofar as he does what is just, but he does not necessarily need to be consistently engaged in debates about the nature of justice or the purpose of the state. No doubt Socrates would disagree, not just because the law might be unjust or the state might do too much or too little, but because, insofar as we are human beings, self-examination is always beneficial to us.

c. Other Socratic Positions and Arguments

In addition to the themes one finds in the Apology, the following are a number of other positions in the Platonic corpus that are typically considered Socratic.

i. Unity of Virtue; All Virtue is Knowledge

In the Protagoras (329b-333b) Socrates argues for the view that all of the virtues—justice, wisdom, courage, piety, and so forth—are one. He provides a number of arguments for this thesis. For example, while it is typical to think that one can be wise without being temperate, Socrates rejects this possibility on the grounds that wisdom and temperance both have the same opposite: folly. Were they truly distinct, they would each have their own opposites. As it stands, the identity of their opposites indicates that one cannot possess wisdom without temperance and vice versa.

This thesis is sometimes paired with another Socratic view, namely that virtue is a form of knowledge (Meno 87e-89a; cf. Euthydemus 278d-282a). Things like beauty, strength, and health benefit human beings, but can also harm them if they are not accompanied by knowledge or wisdom. If virtue is to be beneficial it must be knowledge, since all the qualities of the soul are in themselves neither beneficial nor harmful, but are only beneficial when accompanied by wisdom and harmful when accompanied by folly.

ii. No One Errs Knowingly/No One Errs Willingly

Socrates famously declares that no one errs or makes mistakes knowingly (Protagoras 352c, 358b-c). Here we find an example of Socrates’ intellectualism. When a person does what is wrong, their failure to do what is right is an intellectual error, or due to their own ignorance about what is right. If the person knew what was right, he would have done it. Hence, it is not possible for someone simultaneously to know what is right and to do what is wrong. If someone does what is wrong, they do so because they do not know what is right, and if they claim to have known what was right at the time when they committed the wrong, they are mistaken, for had they truly known what was right, they would have done it.

Socrates therefore denies the possibility of akrasia, or weakness of the will. No one errs willingly (Protagoras 345c4-e6). While it might seem that Socrates is equivocating between knowingly and willingly, a look at Gorgias 466a-468e helps clarify his thesis. Tyrants and orators, Socrates tells Polus, have the least power of any member of the city because they do not do what they want. What they do is not good or beneficial even though human beings only want what is good or beneficial. The tyrant’s will, corrupted by ignorance, is in such a state that what follows from it will necessarily harm him. Conversely, the will that is purified by knowledge is in such a state that what follows from it will necessarily be beneficial.

iii. All Desire is for the Good

One of the premises of the argument just mentioned is that human beings only desire the good. When a person does something for the sake of something else, it is always the thing for the sake of which he is acting that he wants. All bad things or intermediate things are done not for themselves but for the sake of something else that is good. When a tyrant puts someone to death, for instance, he does this because he thinks it is beneficial in some way. Hence his action is directed towards the good because this is what he truly wants (Gorgias 467c-468b).

A similar version of this argument is in the Meno, 77b-78b. Those that desire bad things do not know that they are truly bad; otherwise, they would not desire them. They do not naturally desire what is bad but rather desire those things that they believe to be good but that are in fact bad. They desire good things even though they lack knowledge of what is actually good.

iv. It is Better to Suffer an Injustice Than to Commit One

Socrates infuriates Polus with the argument that it is better to suffer an injustice than commit one (Gorgias 475a-d). Polus agrees that it is more shameful to commit an injustice, but maintains it is not worse. The worst thing, in his view, is to suffer injustice. Socrates argues that, if something is more shameful, it surpasses in either badness or pain or both. Since committing an injustice is not more painful than suffering one, committing an injustice cannot surpass in pain or both pain and badness. Committing an injustice surpasses suffering an injustice in badness; differently stated, committing an injustice is worse than suffering one. Therefore, given the choice between the two, we should choose to suffer rather than commit an injustice.

This argument must be understood in terms of the Socratic emphasis on the care of the soul. Committing an injustice corrupts one’s soul, and therefore committing injustice is the worst thing a person can do to himself (cf. Crito 47d-48a, Republic I 353d-354a). If one commits injustice, Socrates goes so far as to claim that it is better to seek punishment than avoid it on the grounds that the punishment will purge or purify the soul of its corruption (Gorgias 476d-478e).

v. Eudaimonism

The Greek word for happiness is eudaimonia, which signifies not merely feeling a certain way but being a certain way. A different way of translating eudaimonia is well-being. Many scholars believe that Socrates holds two related but not equivalent principles regarding eudaimonia: first, that it is rationally required that a person make his own happiness the foundational consideration for his actions, and second, that each person does in fact pursue happiness as the foundational consideration for his actions. In relation to Socrates’ emphasis on virtue, it is not entirely clear what that means. Virtue could be identical to happiness—in which case there is no difference between the two and if I am virtuous I am by definition happy—virtue could be a part of happiness—in which case if I am virtuous I will be happy although I could be made happier by the addition of other goods—or virtue could be instrumental for happiness—in which case if I am virtuous I might be happy (and I couldn’t be happy without virtue), but there is no guarantee that I will be happy.

There are a number of passages in the Apology that seem to indicate that the greatest good for a human being is having philosophical conversation (36b-d, 37e-38a, 40e-41c). Meno 87c-89a suggests that knowledge of the good guides the soul toward happiness (cf. Euthydemus 278e-282a). And at Gorgias 507a-c Socrates suggests that the virtuous person, acting in accordance with wisdom, attains happiness (cf. Gorgias 478c-e: the happiest person has no badness in his soul).

vi. Ruling is An Expertise

Socrates is committed to the theme that ruling is a kind of craft or art (technē). As such, it requires knowledge. Just as a doctor brings about a desired result for his patient—health, for instance—so the ruler should bring about some desired result in his subject (Republic 341c-d, 342c). Medicine, insofar as it has the best interest of its patient in mind, never seeks to benefit the practitioner. Similarly, the ruler’s job is to act not for his own benefit but for the benefit of the citizens of the political community. This is not to say that there might not be some contingent benefit that accrues to the practitioner; the doctor, for instance, might earn a fine salary. But this benefit is not intrinsic to the expertise of medicine as such. One could easily conceive of a doctor that makes very little money. One cannot, however, conceive of a doctor that does not act on behalf of his patient. Analogously, ruling is always for the sake of the ruled citizen, and justice, contra the famous claim from Thrasymachus, is not whatever is in the interest of the ruling power (Republic 338c-339a).

d. Socrates the Ironist

The suspicion that Socrates is an ironist can mean a number of things. On the one hand, it can indicate that Socrates is saying something with the intent to convey the opposite meaning. Some readers, including a number in the ancient world, understood Socrates’ avowal of ignorance in precisely this way. Many have interpreted Socrates’ praise of Euthyphro, in which he claims that he can learn from him and will become his pupil, as an example of this sort of irony (Euthyphro 5a-b). On the other hand, the Greek word eirōneia was understood to carry with it a sense of subterfuge, rendering the sense of the word something like masking with the intent to deceive.

Additionally, there are a number of related questions about Socrates’ irony. Is the interlocutor supposed to be aware of the irony, or is he ignorant of it? Is it the job of the reader to discern the irony? Is the purpose of irony rhetorical, intended to maintain Socrates’ position as the director of the conversation, or pedagogical, meant to encourage the interlocutor to learn something? Could it be both?

Scholars disagree on the sense in which we ought to call Socrates ironic. When Socrates asks Callicles to tell him what he means by the stronger and to go easy on him so that he might learn better, Callicles claims he is being ironic (Gorgias 489e). Thrasymachus accuses Socrates of being ironic insofar as he pretends he does not have an account of justice, when he is actually hiding what he truly thinks (Republic 337a). And though the Symposium is generally not thought to be a “Socratic” dialogue, we there find Alcibiades accusing Socrates of being ironic insofar as he acts like he is interested in him but then denies his advances (Symposium 216e, 218d). It is not clear which kind of irony is at work in these examples.

Aristotle defines irony as an attempt at self-deprecation (Nicomachean Ethics 4.7, 1127b23-26). He argues that self-deprecation is the opposite of boastfulness, and people that engage in this sort of irony do so to avoid pompousness and make their characters more attractive. Above all, such people disclaim things that bring reputation. On this reading, Socrates was prone to understatement.

There are some thinkers for whom Socratic irony is not just restricted to what Socrates says. The 19^th century Danish philosopher Søren Kierkegaard held the view that Socrates himself, his character, is ironic. The 20^th century philosopher Leo Strauss defined irony as the noble dissimulation of one’s worth. On this reading, Socrates’ irony consisted in his refusal to display his superiority in front of his inferiors so that his message would be understood only by the privileged few. As such, Socratic irony is intended to conceal Socrates’ true message.

3. Method: How Did Socrates Do Philosophy?

As famous as the Socratic themes are, the Socratic method is equally famous. Socrates conducted his philosophical activity by means of question and answer, and we typically associate with him a method called the elenchus. At the same time, Plato’s Socrates calls himself a midwife, one who has no ideas of his own but helps give birth to the ideas of others, and proceeds dialectically, where dialectic is defined either as asking questions, embracing the practice of collection and division, or proceeding from hypotheses to first principles.

a. The Elenchus: Socrates the Refuter

A typical Socratic elenchus is a cross-examination of a particular position, proposition, or definition, in which Socrates tests what his interlocutor says and refutes it. There is, however, great debate amongst scholars regarding not only what is being refuted but also whether or not the elenchus can prove anything. There are questions, in other words, about the topic of the elenchus and its purpose or goal.

i. Topic

Socrates typically begins his elenchus with the question, “what is it?” What is piety, he asks Euthyphro. Euthyphro appears to give five separate definitions of piety: piety is proceeding against whoever does injustice (5d-6e), piety is what is loved by the gods (6e-7a), piety is what is loved by all the gods (9e), the godly and the pious is the part of the just that is concerned with the care of the gods (12e), and piety is the knowledge of sacrificing and praying (13d-14a). For some commentators, what Socrates is searching for here is a definition. Other commentators argue that Socrates is searching for more than just the definition of piety and seeks a comprehensive account of the nature of piety. Whatever the case, Socrates refutes the answer given to him in response to the ‘what is it’ question.

Another reading of the Socratic elenchus is that Socrates is not just concerned with the reply of the interlocutor but is concerned with the interlocutor himself. According to this view, Socrates is as much concerned with the truth or falsity of propositions as he is with the refinement of the interlocutor’s way of life. Socrates is concerned with both epistemological and moral advances for the interlocutor and himself. It is not propositions or replies alone that are refuted, for Socrates does not conceive of them dwelling in isolation from those that hold them. Thus conceived, the elenchus refutes the person holding a particular view, not just the view. For instance, Socrates shames Thrasymachus when he shows him that he cannot maintain his view that justice is ignorance and injustice is wisdom (Republic I 350d). The elenchus demonstrates that Thrasymachus cannot consistently maintain all his claims about the nature of justice. This view is consistent with a view we find in Plato’s late dialogue called the Sophist, in which the Visitor from Elea, not Socrates, claims that the soul will not get any advantage from learning that is offered to it until someone shames it by refuting it (230b-d).

ii. Purpose

In terms of goal, there are two common interpretations of the elenchus. Both have been developed by scholars in response to what Gregory Vlastos called the problem of the Socratic elenchus. The problem is how Socrates can claim that position W is false, when the only thing he has established is its inconsistency with other premises whose truth he has not tried to establish in the elenchus.

The first response is what is called the constructivist position. A constructivist argues that the elenchus establishes the truth or falsity of individual answers. The elenchus on this interpretation can and does have positive results. Vlastos himself argued that Socrates not only established the inconsistency of the interlocutor’s beliefs, but that Socrates’ own moral beliefs were always consistent, able to withstand the test of the elenchus. Socrates could therefore pick out a faulty premise in his elenctic exchange with an interlocutor, and sought to replace the interlocutor’s false beliefs with his own.

The second response is called the non-constructivist position. This position claims that Socrates does not think the elenchus can establish the truth or falsity of individual answers. The non-constructivist argues that all the elenchus can show is the inconsistency of W with the premises X, Y, and Z. It cannot establish that ~W is the case, or, for that matter, replace any of the premises with another, for this would require a separate argument. The elenchus establishes the falsity of the conjunction of W, X, Y, and Z, but not the truth or falsity of any of those premises individually. The purpose of the elenchus on this interpretation is to show the interlocutor that he is confused, and, according to some scholars, to use that confusion as a stepping stone on the way to establishing a more consistent, well-formed set of beliefs.

b. Maieutic: Socrates the Midwife

In Plato’s Theaetetus Socrates identifies himself as a midwife (150b-151b). While the dialogue is not generally considered Socratic, it is elenctic insofar as it tests and refutes Theaetetus’ definitions of knowledge. It also ends without a conclusive answer to its question, a characteristic it shares with a number of Socratic dialogues.

Socrates tells Theaetetus that his mother Phaenarete was a midwife (149a) and that he himself is an intellectual midwife. Whereas the craft of midwifery (150b-151d) brings on labor pains or relieves them in order to help a woman deliver a child, Socrates does not watch over the body but over the soul, and helps his interlocutor give birth to an idea. He then applies the elenchus to test whether or not the intellectual offspring is a phantom or a fertile truth. Socrates stresses that both he and actual midwives are barren, and cannot give birth to their own offspring. In spite of his own emptiness of ideas, Socrates claims to be skilled at bringing forth the ideas of others and examining them.

c. Dialectic: Socrates the Constructer

The method of dialectic is thought to be more Platonic than Socratic, though one can understand why many have associated it with Socrates himself. For one thing, the Greek dialegesthai ordinarily means simply “to converse” or “to discuss.” Hence when Socrates is distinguishing this sort of discussion from rhetorical exposition in the Gorgias, the contrast seems to indicate his preference for short questions and answers as opposed to longer speeches (447b-c, 448d-449c).

There are two other definitions of dialectic in the Platonic corpus. First, in the Republic, Socrates distinguishes between dianoetic thinking, which makes use of the senses and assumes hypotheses, and dialectical thinking, which does not use the senses and goes beyond hypotheses to first principles (Republic VI 510c-511c, VII 531d-535a). Second, in the Phaedrus, Sophist, Statesman, and Philebus, dialectic is defined as a method of collection and division. One collects things that are scattered into one kind and also divides each kind according to its species (Phaedrus 265d-266c).

Some scholars view the elenchus and dialectic as fundamentally different methods with different goals, while others view them as consistent and reconcilable. Some even view them as two parts of one argument procedure, in which the elenchus refutes and dialectic constructs.

4. Legacy: How Have Other Philosophers Understood Socrates?

Nearly every school of philosophy in antiquity had something positive to say about Socrates, and most of them drew their inspiration from him. Socrates also appears in the works of many famous modern philosophers. Immanuel Kant, the 18^th century German philosopher best known for the categorical imperative, hailed Socrates, amongst other ancient philosophers, as someone who didn’t just speculate but who lived philosophically. One of the more famous quotes about Socrates is from John Stuart Mill, the 19^th century utilitarian philosopher who claimed that it is better to be a human being dissatisfied than a pig satisfied; better to be Socrates dissatisfied than a fool satisfied. The following is but a brief survey of Socrates as he is treated in philosophical thinking that emerges after the death of Aristotle in 322 B.C.E.

a. Hellenistic Philosophy

i. The Cynics

The Cynics greatly admired Socrates, and traced their philosophical lineage back to him. One of the first representatives of the Socratic legacy was the Cynic Diogenes of Sinope. No genuine writings of Diogenes have survived and most of our evidence about him is anecdotal. Nevertheless, scholars attribute a number of doctrines to him. He sought to undermine convention as a foundation for ethical values and replace it with nature. He understood the essence of human being to be rational, and defined happiness as freedom and self-mastery, an objective readily accessible to those who trained the body and mind.

ii. The Stoics

There is a biographical story according to which Zeno, the founder of the Stoic school and not the Zeno of Zeno’s Paradoxes, became interested in philosophy by reading and inquiring about Socrates. The Stoics took themselves to be authentically Socratic, especially in defending the unqualified restriction of ethical goodness to ethical excellence, the conception of ethical excellence as a kind of knowledge, a life not requiring any bodily or external advantage nor ruined by any bodily disadvantage, and the necessity and sufficiency of ethical excellence for complete happiness.

Zeno is known for his characterization of the human good as a smooth flow of life. Stoics were therefore attracted to the Socratic elenchus because it could expose inconsistencies—both social and psychological—that disrupted one’s life. In the absence of justification for a specific action or belief, one would not be in harmony with oneself, and therefore would not live well. On the other hand, if one held a position that survived cross-examination, such a position would be consistent and coherent. The Socratic elenchus was thus not just an important social and psychological test, but also an epistemological one. The Stoics held that knowledge was a coherent set of psychological attitudes, and therefore a person holding attitudes that could withstand the elenchus could be said to have knowledge. Those with inconsistent or incoherent psychological commitments were thought to be ignorant.

Socrates also figures in Roman Stoicism, particularly in the works of Seneca and Epictetus. Both men admired Socrates’ strength of character. Seneca praises Socrates for his ability to remain consistent unto himself in the face of the threat posed by the Thirty Tyrants, and also highlights the Socratic focus on caring for oneself instead of fleeing oneself and seeking fulfillment by external means. Epictetus, when offering advice about holding to one’s own moral laws as inviolable maxims, claims, “though you are not yet a Socrates, you ought, however, to live as one desirous of becoming a Socrates” (Enchiridion 50).

One aspect of Socrates to which Epictetus was particularly attracted was the elenchus. Though his understanding of the process is in some ways different from Socrates’, throughout his Discourses Epictetus repeatedly stresses the importance of recognition of one’s ignorance (2.17.1) and awareness of one’s own impotence regarding essentials (2.11.1). He characterizes Socrates as divinely appointed to hold the elenctic position (3.21.19) and associates this role with Socrates’ protreptic expertise (2.26.4-7). Epictetus encouraged his followers to practice the elenchus on themselves, and claims that Socrates did precisely this on account of his concern with self-examination (2.1.32-3).

iii. The Skeptics

Broadly speaking, skepticism is the view that we ought to be either suspicious of claims to epistemological truth or at least withhold judgment from affirming absolute claims to knowledge. Amongst Pyrrhonian skeptics, Socrates appears at times like a dogmatist and at other times like a skeptic or inquirer. On the one hand, Sextus Empiricus lists Socrates as a thinker who accepts the existence of god (Against the Physicists, I.9.64) and then recounts the cosmological argument that Xenophon attributes to Socrates (Against the Physicists, I.9.92-4). On the other hand, in arguing that human being is impossible to conceive, Sextus Empiricus cites Socrates as unsure whether or not he is a human being or something else (Outlines of Pyrrhonism 2.22). Socrates is also said to have remained in doubt about this question (Against the Professors 7.264).

Academic skeptics grounded their position that nothing can be known in Socrates’ admission of ignorance in the Apology (Cicero, On the Orator 3.67, Academics 1.44). Arcesilaus, the first head of the Academy to take it toward a skeptical turn, picked up from Socrates the procedure of arguing, first asking others to give their positions and then refuting them (Cicero, On Ends 2.2, On the Orator 3.67, On the Nature of the Gods 1.11). While the Academy would eventually move away from skepticism, Cicero, speaking on behalf of the Academy of Philo, holds that Socrates should be understood as endorsing the claim that nothing, other than one’s own ignorance, could be known (Academics 2.74).

iv. The Epicureans

The Epicureans were one of the few schools that criticized Socrates, though many scholars think that this was in part because of their animus toward their Stoic counterparts, who admired him. In general, Socrates is depicted in Epicurean writings as a sophist, rhetorician, and skeptic who ignored natural science for the sake of ethical inquiries that concluded without answers. Colotes criticizes Socrates’ statement in the Phaedrus (230a) that he does not know himself (Plutarch, Against Colotes 21 1119b), and Philodemus attacks Socrates’ argument in the Protagoras (319d) that virtue cannot be taught (Rhetoric I 261, 8ff).

The Epicureans wrote a number of books against several of Plato’s Socratic dialogues, including the Lysis, Euthydemus, and Gorgias. In the Gorgias we find Socrates suspicious of the view that pleasure is intrinsically worthy and insistent that pleasure is not the equivalent of the good (Gorgias 495b-499b). In defining pleasure as freedom from disturbance (ataraxia) and defining this sort of pleasure as the sole good for human beings, the Epicureans shared little with the unbridled hedonism Socrates criticizes Callicles for embracing. Indeed, in the Letter to Menoeceus, Epicurus explicitly argues against pursuing this sort of pleasure (131-132). Nonetheless, the Epicureans did equate pleasure with the good, and the view that pleasure is not the equivalent of the good could not have endeared Socrates to them.

Another reason for the Epicurean refusal to praise Socrates or make him a cornerstone of their tradition was his perceived irony. According to Cicero, Epicurus was opposed to Socrates’ representing himself as ignorant while simultaneously praising others like Protagoras, Hippias, Prodicus, and Gorgias (Rhetoric, Vol. II, Brutus 292). This irony for the Epicureans was pedagogically pointless: if Socrates had something to say, he should have said it instead of hiding it.

v. The Peripatetics

Aristotle’s followers, the Peripatetics, either said little about Socrates or were pointedly vicious in their attacks. Amongst other things, the Peripatetics accused Socrates of being a bigamist, a charge that appears to have gained so much traction that the Stoic Panaetius wrote a refutation of it (Plutarch, Aristides 335c-d). The general Peripatetic criticism of Socrates, similar in one way to that of the Epicureans, was that he concentrated solely on ethics, and that this was an unacceptable ideal for the philosophical life.

b. Modern Philosophy

i. Hegel

In Socrates, Hegel found what he called the great historic turning point (Philosophy of History, 448). With Socrates, Hegel claims, two opposed rights came into collision: the individual consciousness and the universal law of the state. Prior to Socrates, morality for the ancients was present but it was not present Socratically. That is, the good was present as a universal, without its having had the form of the conviction of the individual in his consciousness (407). Morality was present as an immediate absolute, directing the lives of citizens without their having reflected upon it and deliberated about it for themselves. The law of the state, Hegel claims, had authority as the law of the gods, and thus had a universal validity that was recognized by all (408).

In Hegel’s view the coming of Socrates signals a shift in the relationship between the individual and morality. The immediate now had to justify itself to the individual consciousness. Hegel thus ascribes to Socrates the habit of asking questions not only about what one should do but also about the actions that the state has prescribed. With Socrates, consciousness is turned back within itself and demands that the law should establish itself before consciousness, internal to it, not merely outside it (408-410). Hegel attributes to Socrates a reflective questioning that is skeptical, which moves the individual away from unreflective obedience and into reflective inquiry about the ethical standards of one’s community.

Generally, Hegel finds in Socrates a skepticism that renders ordinary or immediate knowledge confused and insecure, in need of reflective certainty which only consciousness can bring (370). Though he attributes to the sophists the same general skeptical comportment, in Socrates Hegel locates human subjectivity at a higher level. With Socrates and onward we have the world raising itself to the level of conscious thought and becoming object for thought. The question as to what Nature is gives way to the question about what Truth is, and the question about the relationship of self-conscious thought to real essence becomes the predominant philosophical issue (450-1).

ii. Kierkegaard

Kierkegaard’s best-known views on Socrates are from his dissertation, The Concept of Irony With Continual Reference to Socrates. There, he argues that Socrates is not the ethical figure that the history of philosophy has thought him to be, but rather an ironist in all that he does. Socrates does not just speak ironically but is ironic. Indeed, while most people have found Aristophanes’ portrayal of Socrates an obvious exaggeration and caricature, Kierkegaard goes so far as to claim that Aristophanes came very close to the truth in his depiction of Socrates. He rejects Hegel’s picture of Socrates ushering in a new era of philosophical reflection and instead argues that the limits of Socratic irony testified to the need for religious faith. As opposed to the Hegelian view that Socratic irony was an instrument in the service of the development of self-consciousness, Kierkegaard claims that irony was Socrates’ position or comportment, and that he did not have any more than this to give.

Later in his writing career Kierkegaard comes to think that he has neglected Socrates’ significance as an ethical and religious figure. In his final essay, entitled My Task, Kierkegaard claims that his mission is a Socratic one. That is, in his task to reinvigorate a Christianity that remained the cultural norm but had, in Kierkegaard’s eyes, nearly ceased altogether to be practiced authentically, Kierkegaard conceives of himself as a kind of Christian Socrates, rousing Christians from their complacency to a conception of Christian faith as the highest, most passionate expression of individual subjectivity. Kierkegaard therefore sees himself as a sort of Christian gadfly. The Socratic call to become aware of one’s own ignorance finds its parallel in the Kierkegaardian call to recognize one’s own failing to truly live as a Christian. The Socratic claim to ignorance (made while Socrates is closer to knowledge than his contemporaries) is replaced by Kierkegaard’s claim that he is not a Christian (though he is certainly more so than his own contemporaries).

iii. Nietzsche

Nietzsche’s most famous account of Socrates is his scathing portrayal in The Birth of Tragedy, in which Socrates and rational thinking lead to the emergence of an age of decadence in Athens. The delicate balance in Greek culture between the Apollonian (order, calmness, self-control, restraint) and the Dionysian (chaos, revelry, self-forgetfulness, indulgence), initially represented on stage in the tragedies of Aeschylus and Sophocles, gave way to the rationalism of Euripides. Euripides, Nietzsche argues, was only a mask for the newborn demon called Socrates (section 12). Tragedy, and Greek culture more generally, was corrupted by “aesthetic Socratism”, whose supreme law, Nietzsche argues, was that “to be beautiful everything must be intelligible”. Whereas the former sort of tragedy absorbed the spectator in the activities and sufferings of its chief characters, the emergence of Socrates heralded the onset of a new kind of tragedy in which this identification is obstructed by the spectators’ having to figure out the meaning and presuppositions of the characters’ suffering.

Nietzsche continues his attack on Socrates later in his career in Twilight of the Idols. Socrates here represents the lowest class of people (section 3), and his irony consists in his being an exaggeration at the same time as he conceals himself (4). He is the inventor of dialectic (5) which he wields mercilessly because, being an ugly plebeian, he had no other means of expressing himself (6) and therefore employed question and answer to render his opponent powerless (7). Socrates turned dialectic into a new kind of contest (8), and because his instincts had turned against each other and were in anarchy (9), he established the rule of reason as a counter-tyrant in order not to perish (10). Socrates’ decadence here consists in his having to fight his instincts (11). He was thus profoundly anti-life, so much so that he wanted to die (12).

Nonetheless, while Nietzsche accuses Socrates of decadence, he recognizes him as a powerful individual, which perhaps accounts for why we at times find in Nietzsche a hesitant admiration of Socrates. He calls Socrates one of the very greatest instinctive forces (The Birth of Tragedy, section 13), labels him a “free spirit” (Human, All Too Human I, 433), praises him as the first “philosopher of life” in his 17th lecture on the Preplatonics, and anoints him a “virtuoso of life” in his notebooks from 1875. Additionally, contra Twilight of the Idols, in Thus Spoke Zarathustra, Nietzsche speaks of a death in which one’s virtue still shines, and some commentators have seen in this a celebration of the way in which Socrates died.

iv. Heidegger

Heidegger finds in Socrates a kinship with his own view that the truth of philosophy lies in a certain way of seeing things, and thus is identical with a particular kind of method. He attributes to Socrates the view that the truth of some subject matter shows itself not in some definition that is the object or end of a process of inquiry, but in the very process of inquiry itself. Heidegger characterizes the Socratic method as a kind of productive negation: by refuting that which stands in front of it—in Socrates’ case, an interlocutor’s definition—it discloses the positive in the very process of questioning. Socrates is not interested in articulating propositions about piety but rather concerned with persisting in a questioning relation to it that preserves its irreducible sameness. Behind multiple examples of pious action is Piety, and yet Piety is not something that can be spoken of. It is that which discloses itself through the process of silent interrogation.

It is precisely in his emphasis on silence that Heidegger diverges from Socrates. Where Socrates insisted on the give and take of question and answer, Heideggerian questioning is not necessarily an inquiry into the views of others but rather an openness to the truth that one maintains without the need to speak. To remain in dialogue with a given phenomenon is not the same thing as conversing about it, and true dialogue is always silent.

v. Gadamer

As Heidegger’s student, Gadamer shares his fundamental view that truth and method cannot be divorced in philosophy. At the same time, his hermeneutics leads him to argue for the importance of dialectic as conversation. Gadamer claims that just as philosophical dialectic presents the whole truth by superseding all its partial propositions, hermeneutics too has the task of revealing a totality of meaning in all its relations. The distinguishing characteristic of Gadamer’s hermeneutical dialectic is that it recognizes radical finitude: we are always already in an open-ended dialogical situation. Conversation with the interlocutor is thus not a distraction that leads us away from seeing the truth but rather is the site of truth. It is for this reason that Gadamer claims Plato communicated his philosophy only in dialogues: it was not just homage to Socrates but a reflection of his view that the word finds its confirmation in another and in the agreement of another.

Gadamer also sees in the Socratic method an ethical way of being. That is, he does not just think that Socrates converses about ethics but that repeated Socratic conversation is itself indicative of an ethical comportment. On this account, Socrates knows the good not because he can give some final definition of it but rather because of his readiness to give an account of it. The problem with not living an examined life is not that we might live without knowing what is ethical, but that without asking questions as Socrates does, we will not be ethical.

5. References and Further Reading

Ahbel-Rappe, Sara, and Rachana Kamtekar (eds.), A Companion to Socrates (Oxford: Blackwell, 2006).
Arrowsmith, William, Lattimore, Richmond, and Parker, Douglass (trans.), Four Plays by Aristophanes: The Clouds, The Birds, Lysistrata, The Frogs (New York: Meridian, 1994).
Barnes, Jonathan, Complete Works of Aristotle vols. 1 & 2 (Princeton: Princeton University Press, 1984).
Benson, Hugh H. (ed.), Essays on the Philosophy of Socrates (New York: Oxford University Press, 1992).
Brickhouse, Thomas C. & Smith, Nicholas D., Plato’s Socrates (Oxford: Oxford University Press, 1994).
Burkert, Walter, Greek Religion (Cambridge: Harvard University Press, 1985).
Cooper, John M. (ed.), Plato: Complete Works (Indianapolis: Hackett, 1997).
Guthrie, W.K.C., Socrates (Cambridge: Cambridge University Press, 1971).
Kahn, Charles H., Plato and the Socratic Dialogue (Cambridge: Cambridge University Press, 1996).
Kraut, Richard (ed.), The Cambridge Companion to Plato (Cambridge: Cambridge University Press, 1992).
Morrison, Donald R., The Cambridge Companion to Socrates (Cambridge: Cambridge University Press, 2012).
Rudebusch, George, Socrates (Malden, MA: Wiley-Blackwell, 2009).
Santas, Gerasimos, Socrates: Philosophy in Plato’s Early Dialogues (London: Routledge & Kegan Paul, 1979).
Taylor, C.C.W., Socrates (Oxford: Oxford University Press, 1998).
Vlastos, Gregory, Socrates, Ironist and Moral Philosopher (Cambridge: Cambridge University Press, 1991).
Xenophon: Memorabilia. Oeconomicus. Symposium. Apologia. (Loeb Classical Library, Cambridge: Harvard University Press, 1923).

Author Information

James M. Ambury
Email: jamesambury@kings.edu
King’s College
U. S. A.

John Locke (1632–1704)

John Locke was among the most famous philosophers and political theorists of the 17th century. He is often regarded as the founder of a school of thought known as British Empiricism, and he made foundational contributions to modern theories of limited, liberal government. He was also influential in the areas of theology, religious toleration, and educational theory. In his most important work, the Essay Concerning Human Understanding, Locke set out to offer an analysis of the human mind and its acquisition of knowledge. He offered an empiricist theory according to which we acquire ideas through our experience of the world. The mind is then able to examine, compare, and combine these ideas in numerous different ways. Knowledge consists of a special kind of relationship between different ideas. Locke’s emphasis on the philosophical examination of the human mind as a preliminary to the philosophical investigation of the world and its contents represented a new approach to philosophy, one which quickly gained a number of converts, especially in Great Britain. In addition to this broader project, the Essay contains a series of more focused discussions on important, and widely divergent, philosophical themes. In politics, Locke is best known as a proponent of limited government. He uses a theory of natural rights to argue that governments have obligations to their citizens, have only limited powers over their citizens, and can ultimately be overthrown by citizens under certain circumstances. He also provided powerful arguments in favor of religious toleration. This article attempts to give a broad overview of all key areas of Locke’s thought.

1. Life and Works
2. The Main Project of the Essay
3. Special Topics in the Essay
4. Political Philosophy
5. Theology
6. Education
7. Locke’s Influence
8. References and Further Reading
a. Locke’s Works
b. Recommended Reading

1. Life and Works

John Locke was born in 1632 in Wrington, a small village in southwestern England. His father, also named John, was a legal clerk and served with the Parliamentary forces in the English Civil War. His family was well-to-do, but not of particularly high social or economic standing. Locke spent his childhood in the West Country and as a teenager was sent to Westminster School in London.

Locke was successful at Westminster and earned a place at Christ Church, Oxford. He was to remain in Oxford from 1652 until 1667. Although he had little appreciation for the traditional scholastic philosophy he learned there, Locke was successful as a student and after completing his undergraduate degree he held a series of administrative and academic posts in the college. Some of Locke’s duties included instruction of undergraduates. One of his earliest substantive works, the Essays on the Law of Nature, was developed in the course of his teaching duties. Much of Locke’s intellectual effort and energy during his time at Oxford, especially during his later years there, was devoted to the study of medicine and natural philosophy (what we would now call science). Locke read widely in these fields, participated in various experiments, and became acquainted with Robert Boyle and many other notable natural philosophers. He also undertook the normal course of education and training to become a physician.

Locke left Oxford for London in 1667, where he became attached to the family of Anthony Ashley Cooper (then Lord Ashley, later the Earl of Shaftesbury). Locke may have played a number of roles in the household, most likely serving as tutor to Ashley’s son. In London, Locke continued to pursue his interests in medicine and natural philosophy. He formed a close working relationship with Thomas Sydenham, who later became one of the most famous physicians of the age. He made a number of contacts within the newly formed Royal Society and became a member in 1668. He also acted as the personal physician to Lord Ashley. Indeed, on one occasion Locke participated in a very delicate surgical operation which Ashley credited with saving his life. Ashley was one of the most prominent English politicians at the time. Through his patronage Locke was able to hold a series of governmental posts. Most of his work related to policies in England’s American and Caribbean colonies. Most importantly, this was the period in Locke’s life when he began the project which would culminate in his most famous work, the Essay Concerning Human Understanding. The two earliest drafts of that work date from 1671. He was to continue work on this project intermittently for nearly twenty years.

Locke travelled in France for several years starting in 1675. When he returned to England it was only to be for a few years. The political scene had changed greatly while Locke was away. Shaftesbury (as Ashley was now known) was out of favor and Locke’s association with him had become a liability. It was around this time that Locke composed his most famous political work, the Two Treatises of Government. Although the Two Treatises would not be published until 1689, they show that he had already solidified his views on the nature and proper form of government. Following Shaftesbury’s death Locke fled to the Netherlands to escape political persecution. While there Locke travelled a great deal (sometimes for his own safety) and worked on two projects. First, he continued work on the Essay. Second, he wrote a work entitled Epistola de Tolerantia, which was published anonymously in 1689. Locke’s experiences in England, France, and the Netherlands convinced him that governments should be much more tolerant of religious diversity than was common at the time.

Following the Glorious Revolution of 1688-1689 Locke was able to return to England. He published both the Essay and the Two Treatises (the second anonymously) shortly after his return. He initially stayed in London but soon moved to the home of Francis and Damaris Masham in the small village of Oates, Essex. Damaris Masham, who was the daughter of a notable philosopher named Ralph Cudworth, had become acquainted with Locke several years before. The two formed a very close friendship which lasted until Locke’s death. During this period Locke kept busy working on politics, toleration, philosophy, economics, and educational theory.

Locke engaged in a number of controversies during his life, including a notable one with Jonas Proast over toleration. But Locke’s most famous and philosophically important controversy was with Edward Stillingfleet, the Bishop of Worcester. Stillingfleet, in addition to being a powerful political and theological figure, was an astute and forceful critic. The two men debated a number of the positions in the Essay in a series of published letters.

In his later years Locke devoted much of his attention to theology. His major work in this field was The Reasonableness of Christianity, published (again anonymously) in 1695. This work was controversial because Locke argued that many beliefs traditionally held to be mandatory for Christians were unnecessary. Locke argued for a highly ecumenical form of Christianity. Closer to the time of his death Locke wrote a work on the Pauline Epistles. The work was unfinished, but published posthumously. A short work on miracles also dates from this time and was published posthumously.

Locke suffered from health problems for most of his adult life. In particular, he had respiratory ailments which were exacerbated by his visits to London where the air quality was very poor. His health took a turn for the worse in 1704 and he became increasingly debilitated. He died on 28 October 1704 while Damaris Masham was reading him the Psalms. He was buried at High Laver, near Oates. He wrote his own epitaph which was both humble and forthright.

2. The Main Project of the Essay

According to Locke’s own account the motivation for writing the Essay came to him while debating an unrelated topic with friends. He reports that they were able to make little headway on this topic and that they very quickly met with a number of confusions and difficulties. Locke realized that to make progress on this topic it was first necessary to examine something more fundamental: the human understanding. It was “necessary to examine our own Abilities, and see, what Objects our Understandings were, or were not fitted to deal with.” (Epistle, 7).

Locke’s insight was that before we can analyze the world and our access to it we have to know something about ourselves. We need to know how we acquire knowledge. We also need to know which areas of inquiry we are well suited to and which are epistemically closed to us, that is, which areas are such that we could not know them even in principle. We further need to know what knowledge consists in. In keeping with these questions, at the very outset of the Essay Locke writes that it is his “Purpose to enquire into the Original, Certainty, and Extent of humane Knowledge; together, with the Grounds and Degrees of Belief, Opinion, and Assent.” (1.1.2, 42). Locke thinks that it is only once we understand our cognitive capabilities that we can suitably direct our researches into the world. This may have been what Locke had in mind when he claimed that part of his ambition in the Essay was to be an “Under-Laborer” who cleared the ground and laid the foundations for the work of famous scientists like Robert Boyle and Isaac Newton.

The Essay is divided into four books with each book contributing to Locke’s overall goal of examining the human mind with respect to its contents and operations. In Book I Locke rules out one possible origin of our knowledge. He argues that our knowledge cannot have been innate. This sets up Book II in which Locke argues that all of our ideas come from experience. In this book he seeks to give an account of how even ideas like God, infinity, and space could have been acquired through our perceptual access to the world and our mental operations. Book III is something of a digression as Locke turns his attention to language and the role it plays in our theorizing. Locke’s main goal here is cautionary: he thinks language is often an obstacle to understanding and he offers some recommendations to avoid confusion. Finally, Book IV discusses knowledge, belief, and opinion. Locke argues that knowledge consists of special kinds of relations between ideas and that we should regulate our beliefs accordingly.

a. Ideas

The first chapter of the Essay contains an apology for the frequent use of the word “idea” in the book. According to Locke, ideas are the fundamental units of mental content and so play an integral role in his explanation of the human mind and his account of our knowledge. Locke was not the first philosopher to give ideas a central role; Descartes, for example, had relied heavily on them in explaining the human mind. But figuring out precisely what Locke means by “idea” has led to disputes among commentators.

One place to begin is with Locke’s own definition. He claims that by “idea” he means “whatsoever is the Object of the Understanding when a Man thinks…whatever is meant by Phantasm, Notion, Species, or whatever it is, which the Mind can be employ’d about in thinking.” (1.1.8, 47). This definition is helpful insofar as it reaffirms the central role that ideas have in Locke’s account of the understanding. Ideas are the sole entities upon which our minds work. Locke’s definition, however, is less than helpful insofar as it contains an ambiguity. On one reading, ideas are mental objects. The thought is that when an agent perceives an external world object like an apple there is some thing in her mind which represents that apple. So when an agent considers an apple what she is really doing is thinking about the idea of that apple. On a different reading, ideas are mental actions. The thought here is that when an agent perceives an apple she is really perceiving the apple in a direct, unmediated way. The idea is the mental act of making perceptual contact with the external world object. In recent years, most commentators have adopted the first of these two readings. But this debate will be important in the discussion of knowledge below.

b. The Critique of Nativism

The first of the Essay’s four books is devoted to a critique of nativism, the doctrine that some ideas are innate in the human mind, rather than received in experience. It is unclear precisely who Locke’s targets in this book are, though Locke does cite Herbert of Cherbury and other likely candidates include René Descartes, the Cambridge Platonists, and a number of lesser known Anglican theologians. Finding specific targets, however, might not be that important given that much of what Locke seeks to do in Book I is motivate and make plausible the alternative account of idea acquisition that he offers in Book II.

The nativist view which Locke attacks in Book I holds that human beings have mental content which is innate in the mind. This means that there are certain ideas (units of mental content) which were neither acquired via experience nor constructed by the mind out of ideas received in experience. The most popular version of this position holds that there are certain ideas which God planted in all minds at the moment of their creation.

Locke attacks both the view that we have any innate principles (for example, the whole is greater than the part, do unto others as you would have done unto you, etc.) and the view that there are any innate singular ideas (for example, God, identity, substance, and so forth). The main thrust of Locke’s argument lies in pointing out that none of the mental content alleged to be innate is universally shared by all humans. He notes that children and the mentally disabled, for example, do not have in their minds an allegedly innate complex thought like “equals taken from equals leave equals”. He also uses evidence from travel literature to point out that many non-Europeans deny what were taken to be innate moral maxims and that some groups even lack the idea of a God. Locke takes the fact that not all humans have these ideas as evidence that they were not implanted by God in human minds, and that they are therefore acquired rather than innate.

There is one misunderstanding which it is important to avoid when considering Locke’s anti-nativism. The misunderstanding is, in part, suggested by Locke’s claim that the mind is like a tabula rasa (a blank slate) prior to sense experience. This makes it sound as though the mind is nothing prior to the advent of ideas. In fact, Locke’s position is much more nuanced. He makes it clear that the mind has any number of inherent capacities, predispositions, and inclinations prior to receiving any ideas from sensation. His anti-nativist point is just that none of these is triggered or exercised until the mind receives ideas from sensation.

c. Idea Acquisition

In Book II Locke offers his alternative theory of how the human mind comes to be furnished with the ideas it has. Every day we think of complex things like orange juice, castles, justice, numbers, and motion. Locke’s claim is that the ultimate origin of all of these ideas lies in experience: “Experience: In that, all our Knowledge is founded; and from that it ultimately derives itself. Our Observation employ’d either about external, sensible Objects; or about the internal Operations of our Minds, perceived and reflected on by ourselves, is that, which supplies our Understandings with all the material of thinking. These two are the Fountains of Knowledge, from whence all the Ideas we have, or can naturally have, do spring.” (2.1.2, 104).

In the above passage Locke allows for two distinct types of experience. Outer experience, or sensation, provides us with ideas from the traditional five senses. Sight gives us ideas of colors, hearing gives us ideas of sounds, and so on. Thus, my idea of a particular shade of green is a product of seeing a fern. And my idea of a particular tone is the product of my being in the vicinity of a piano while it was being played. Inner experience, or reflection, is slightly more complicated. Locke thinks that the human mind is incredibly active; it is constantly performing what he calls operations. For example, I often remember past birthday parties, imagine that I was on vacation, desire a slice of pizza, or doubt that England will win the World Cup. Locke believes that we are able to notice or experience our mind performing these actions and when we do we receive ideas of reflection. These are ideas such as memory, imagination, desire, doubt, judgment, and choice.

Locke’s view is that experience (sensation and reflection) issues us with simple ideas. These are the minimal units of mental content; each simple idea is “in itself uncompounded, [and] contains in it nothing but one uniform Appearance, or Conception in the mind, and is not distinguishable into different Ideas.” (2.2.1, 119). But many of my ideas are not simple ideas. My idea of a glass of orange juice or my idea of the New York subway system, for example, could not be classed as simple ideas. Locke calls ideas like these complex ideas. His view is that complex ideas are the product of combining our simple ideas together in various ways. For example, my complex idea of a glass of orange juice consists of various simple ideas (the color orange, the feeling of coolness, a certain sweet taste, a certain acidic taste, and so forth) combined together into one object. Thus, Locke believes our ideas are compositional. Simple ideas combine to form complex ideas. And these complex ideas can be combined to form even more complex ideas.

We are now in a position to understand the character of Locke’s empiricism. He is committed to the view that all of our ideas, everything we can possibly think of, can be broken down into simple ideas received in experience. The bulk of Book II is devoted to making this empiricism plausible. Locke does this both by undertaking an examination of the various abilities that the human mind has (memory, abstraction, volition, and so forth) and by offering an account of how even abstruse ideas like space, infinity, God, and causation could be constructed using only the simple ideas received in experience.

Our complex ideas are classified into three different groups: substances, modes, and relations. Ideas of substances are ideas of things which are thought to exist independently. Ordinary objects like desks, sheep, and mountains fall into this group. But there are also ideas of collective substances, which consist of individual substances considered as forming a whole. A group of individual buildings might be considered a town. And a group of individual men and women might be considered together as an army. In addition to describing the way we think about individual substances, Locke also has an interesting discussion of substance-in-general. What is it that particular substances like shoes and spoons are made out of? We could suggest that they are made out of leather and metal. But the question could be repeated: what are leather and metal made of? We might respond that they are made of matter. But even here, Locke thinks we can ask what matter is made of. What gives rise to the properties of matter? Locke claims that we don’t have a very clear idea here. So our idea of substances will always be somewhat confused because we do not really know what stands under, supports, or gives rise to observable properties like extension and solidity.

Ideas of modes are ideas of things which are dependent on substances in some way. In general, this taxonomic category can be somewhat tricky. It does not seem to have a clear parallel in contemporary metaphysics, and it is sometimes thought to be a mere catch-all category for things which are neither substances nor relations. But it is helpful to think of modes as being like features of substances; modes are “such complex Ideas, which however compounded, contain not in them the supposition of subsisting by themselves, but are considered as Dependences on, or Affections of Substances.” (2.12.4, 165). Modes come in two types: simple and mixed. Simple modes are constructed by combining many instances of a single type of simple idea. For example, Locke believes there is a simple idea of unity. Our complex idea of the number seven is a simple mode, constructed by concatenating seven simple ideas of unity together. Locke uses this category to explain how we think about a number of topics relating to number, space, time, pleasure and pain, and cognition. Mixed modes, on the other hand, involve combining together simple ideas of more than one kind. A great many ideas fall into this category. But the most important ones are moral ideas. Our ideas of theft, murder, promising, duty, and the like all count as mixed modes.

Ideas of relations are ideas that involve more than one substance. My idea of a husband, for example, is more than the idea of an individual man. It also must include the idea of another substance, namely the idea of that man’s spouse. Locke is keen to point out that much more of our thought involves relations than we might previously have thought. For example, when I think about Elizabeth II as the Queen of England my thinking actually involves relations, because I cannot truly think of Elizabeth as a queen without conceiving of her as having a certain relationship of sovereignty to some subjects (individual substances like David Beckham and J.K. Rowling). Locke then goes on to explore the role that relations have in our thinking about causation, space, time, morality, and (very famously) identity.

Throughout his discussion of the different kinds of complex ideas Locke is keen to emphasize that all of our ideas can ultimately be broken down into simple ideas received in sensation and reflection. Put differently, Locke is keenly aware that the success of his empiricist theory of mind depends on its ability to account for all the contents of our minds. Whether or not Locke is successful is a matter of dispute. On some occasions the analysis he gives of how a very complex idea could be constructed using only simple ideas is vague and requires the reader to fill in some gaps. And commentators have also suggested that some of the simple ideas Locke invokes, for example the simple ideas of power and unity, do not seem to be obvious components of our phenomenological experience.

Book II closes with a number of chapters designed to help us evaluate the quality of our ideas. Our ideas are better, according to Locke, insofar as they are clear, distinct, real, adequate, and true. Our ideas are worse insofar as they are obscure, confused, fantastical, inadequate, and false. Clarity and obscurity are explained via an analogy to vision. Clear ideas, like clear images, are crisp and fresh, not faded or diminished in the way that obscure ideas (or images) are. Distinction and confusion have to do with the individuation of ideas. Ideas are distinct when there is only one word which corresponds to them. Confused ideas are ones to which more than one word can correctly apply or ones that lack a clear and consistent correlation to one particular word. To use one of Locke’s examples, an idea of a leopard as a beast with spots would be confused. It is not distinct because the word “lynx” could apply to that idea just as easily as the word “leopard.” Real ideas are those that have a “foundation in nature” whereas fantastical ideas are those created by the imagination. For example, our idea of a horse would be a real idea and our idea of a unicorn would be fantastical. Adequacy and inadequacy have to do with how well ideas match the patterns according to which they were made. Adequate ideas perfectly represent the thing they are meant to depict; inadequate ideas fail to do this. Ideas are true when the mind understands them in a way that is correct according to linguistic practices and the way the world is structured. They are false when the mind misunderstands them along these lines.

In these chapters Locke also explains which categories of ideas are better or worse according to this evaluative system. Simple ideas do very well. Because objects directly produce them in the mind they tend to be clear, distinct, and so forth. Ideas of modes and relations also tend to do very well, but for a different reason. Locke thinks that the archetypes of these ideas are in the mind rather than in the world. As such, it is easy for these ideas to be good because the mind has a clear sense of what the ideas should be like as it constructs them. By contrast, ideas of substances tend to fare very poorly. The archetypes for these ideas are external world objects. Because our perceptual access to these objects is limited in a number of ways and because these objects are so intricate, ideas of substances tend to be confused, inadequate, false, and so forth.

d. Language

Book III of the Essay is concerned with language. Locke admits that this topic is something of a digression. He did not originally plan for language to take up an entire book of the Essay. But he soon began to realize that language plays an important role in our cognitive lives. Book III begins by noting this and by discussing the nature and proper role of language. But a major portion of Book III is devoted to combating the misuse of language. Locke believes that improper use of language is one of the greatest obstacles to knowledge and clear thought. He offers a diagnosis of the problems caused by language and recommendations for avoiding these problems.

Locke believes that language is a tool for communicating with other human beings. Specifically, Locke thinks that we want to communicate about our ideas, the contents of our minds. From here it is a short step to the view that: “Words in their primary or immediate Signification, stand for nothing, but the Ideas in the Mind of him that uses them.” (3.2.2, 405). When an agent utters the word “gold” she is referring to her idea of a shiny, yellowish, malleable substance of great value. When she utters the word “carrot” she is referring to her idea of a long, skinny, orange vegetable which grows underground. Locke is, of course, aware that the names we choose for these ideas are arbitrary and merely a matter of social convention.

Although the primary use of words is to refer to ideas in the mind of the speaker, Locke also allows that words make what he calls “secret reference” to two other things. First, humans also want their words to refer to the corresponding ideas in the minds of other humans. When Smith says “carrot” within earshot of Jones her hope is that Jones also has an idea of the long, skinny vegetable and that saying “carrot” will bring that idea into Jones’ mind. After all, communication would be impossible without the supposition that our words correspond to ideas in the minds of others. Second, humans suppose that their words stand for objects in the world. When Smith says “carrot” she wants to refer to more than just her idea; she also wants to refer to the long skinny objects themselves. But Locke is suspicious of these two other ways of understanding signification. He thinks the latter one, in particular, is illegitimate.

After discussing these basic features of language and reference Locke goes on to discuss specific cases of the relationship between ideas and words: words used for simple ideas, words used for modes, words used for substances, the way in which a single word can refer to a multiplicity of ideas, and so forth. There is also an interesting chapter on “particles.” These are words which do not refer to an idea but instead refer to a certain connection which holds between ideas. For example, if I say “Secretariat is brown” the word “Secretariat” refers to my idea of a certain racehorse, and “brown” refers to my idea of a certain color, but the word “is” does something different. That word is a particle and indicates that I am expressing something about the relationship between my ideas of Secretariat and brown and suggesting that they are connected in a certain way. Other particles include words like “and”, “but”, “hence”, and so forth.

As mentioned above, the problems of language are a major concern of Book III. Locke thinks that language can lead to confusion and misunderstanding for a number of reasons. The signification of words is arbitrary, rather than natural, and this means it can be difficult to understand which words refer to which ideas. Many of our words stand for ideas which are complex, hard to acquire, or both. So many people will struggle to use those words appropriately. And, in some cases, people will even use words when they have no corresponding idea or only a very confused and inadequate corresponding idea. Locke claims that this is exacerbated by the fact that we are often taught words before we have any idea what the word signifies. A child, for example, might be taught the word “government” at a young age, but it will take her years to form a clear idea of what governments are and how they operate. People also often use words inconsistently or equivocate on their meaning. Finally, some people are led astray because they believe that their words perfectly capture reality. Recall from above that people secretly and incorrectly use their words to refer to objects in the external world. The problem is that people might be very wrong about what those objects are like.

Locke thinks that a result of all this is that people are seriously misusing language and that many debates and discussions in important fields like science, politics, and philosophy are confused or consist of merely verbal disputes. Locke provides a number of examples of language causing problems: Cartesians using “body” and “extension” interchangeably, even though the two ideas are distinct; physiologists who agree on all the facts yet have a long dispute because they have different understandings of the word “liquor”; Scholastic philosophers using the term “prime matter” when they are unable to actually frame an idea of such a thing, and so forth.

The remedies that Locke recommends for fixing these problems created by language are somewhat predictable. But Locke is quick to point out that while they sound like easy fixes they are actually quite difficult to implement. The first and most important step is to only use words when we have clear ideas attached to them. (Again, this sounds easy, but many of us might actually struggle to come up with a clear idea corresponding to even everyday terms like “glory” or “fascist”.) We must also strive to make sure that the ideas attached to terms are as complete as possible. We must strive to ensure that we use words consistently and do not equivocate; every time we utter a word we should use it to signify one and the same idea. Finally, we should communicate our definitions of words to others.

e. The Account of Knowledge

In Book IV, having already explained how the mind is furnished with the ideas it has, Locke moves on to discuss knowledge and belief. A good place to start is with a quote from the beginning of Book IV: “Knowledge then seems to me to be nothing but the perception of the connexion and agreement, or disagreement and repugnancy of any of our Ideas. Where this Perception is, there is Knowledge, and where it is not, there, though we may fancy, guess, or believe, yet we always come short of Knowledge.” (4.1.2, 525). Locke spends the first part of Book IV clarifying and exploring this conception of knowledge. The second part focuses on how we should apportion belief in cases where we lack knowledge.

What does Locke mean by the “connection and agreement” and the “disagreement and repugnancy” of our ideas? Some examples might help. Bring to mind your idea of white and your idea of black. Locke thinks that upon doing this you will immediately perceive that they are different; they “disagree”. It is when you perceive this disagreement that you know the fact that white is not black. Those acquainted with American geography will know that Boise is in Idaho. On Locke’s account of knowledge, this means that they are able to perceive a certain connection that obtains between their idea of Idaho and their idea of Boise. Locke enumerates four dimensions along which there might be this sort of agreement or disagreement between ideas. First, we can perceive when two ideas are identical or non-identical. For example, knowing that sweetness is not bitterness consists in perceiving that the idea of sweetness is not identical to the idea of bitterness. Second, we can perceive relations that obtain between ideas. For example, knowing that 7 is greater than 3 consists in perceiving that there is a size relation of bigger and smaller between the two ideas. Third, we can perceive when our idea of a certain feature accompanies our idea of a certain thing. If I know that ice is cold this is because I perceive that my idea of cold always accompanies my idea of ice. Fourth, we can perceive when existence agrees with any idea. I can have knowledge of this fourth kind when, for example, I perform the cogito and recognize the special relation between my idea of myself and my idea of existence. Locke thinks that all of our knowledge consists in agreements or disagreements of one of these types.

After detailing the types of relations between ideas which constitute knowledge Locke continues on to discuss three “degrees” of knowledge in 4.2. These degrees seem to consist in different ways of knowing something. The first degree Locke calls intuitive knowledge. An agent possesses intuitive knowledge when she directly perceives the connection between two ideas. This is the best kind of knowledge, as Locke says “Such kind of Truths, the Mind perceives at the first sight of the Ideas together, by bare Intuition, without the intervention of any other Idea; and this kind of knowledge is the clearest, and most certain, that humane Frailty is capable of.” (4.2.1, 531). The second degree of knowledge is called demonstrative. Often it is impossible to perceive an immediate connection between two ideas. For example, most of us are unable to tell that the three interior angles of a triangle are equal to two right angles simply by looking at them. But most of us, with the assistance of a mathematics teacher, can be made to see that they are equal by means of a geometric proof or demonstration. This is the model for demonstrative knowledge. Even if one is unable to directly perceive a relation between idea-X and idea-Y one might perceive a relation indirectly by means of idea-A and idea-B. This will be possible if the agent has intuitive knowledge of a connection between X and A, between A and B, and then between B and Y. Demonstrative knowledge consists, therefore, in a string of relations each of which is known intuitively.

The third degree of knowledge is called sensitive knowledge and has been the source of considerable debate and confusion among Locke commentators. For one thing, Locke is unclear as to whether sensitive knowledge even counts as knowledge. He writes that intuitive and demonstrative knowledge are, properly speaking, the only forms of knowledge, but that “There is, indeed, another Perception of the Mind…which going beyond bare probability, and yet not reaching perfectly to either of the foregoing degrees of certainty, passes under the name of Knowledge.” (4.2.14, 537). Sensitive knowledge has to do with the relationship between our ideas and the objects in the external world that produce them. Locke claims that we can be certain that when we perceive something, an orange, for example, there is an object in the external world which is responsible for these sensations. Part of Locke’s claim is that there is a serious qualitative difference between biting into an orange and remembering biting into an orange. There is something in the phenomenological experience of the former which assures us of a corresponding object in the external world.

Locke spends a fair amount of time in Book IV responding to worries that he is a skeptic or that his account of knowledge, with its emphasis on ideas, fails to be responsive to the external world. The general worry for Locke is fairly simple. By claiming that ideas are the only things humans have epistemic access to, and by claiming that knowledge relates only to our ideas, Locke seems to rule out the claim that we can ever know about the external world. Lockean agents are trapped behind a “veil of ideas.” Thus we cannot have any assurance that our ideas provide us with reliable information about the external world. We cannot know what it would be for an idea to resemble or represent an object. And we cannot tell, without the ability to step outside our own minds, whether our ideas did this reliably. This criticism has historically been thought to endanger Locke’s entire project. Gilbert Ryle’s memorable assessment is that “nearly every youthful student of philosophy both can and does in his second essay refute Locke’s entire Theory of Knowledge.” Recent scholarship has been much more charitable to Locke. But the central problem is still a pressing one.

Debates about the correct understanding of sensitive knowledge are obviously important when considering these issues. At first blush, the relation involved in sensitive knowledge seems to be a relation between an idea and a physical object in the world. But, if this reading is correct, then it becomes difficult to understand the many passages in which Locke insists that knowledge is a relation that holds only between ideas. Also relevant are debates about how to correctly understand Lockean ideas. Recall from above that although many understand ideas as mental objects, some understand them as mental acts. While most of the text seems to favor the first interpretation, it seems that the second interpretation has a significant advantage when responding to these skeptical worries. The reason is that the connection between ideas and external world objects is built right into the definition of an idea. An idea just is a perception of an external world object.

However the debates discussed in the previous paragraph are resolved, there is a consensus among commentators that Locke believes the scope of human understanding is very narrow. Humans are not capable of very much knowledge. Locke discusses this in 4.3, a chapter entitled “Extent of Humane Knowledge.” The fact that our knowledge is so limited should come as no surprise. We have already discussed the ways in which our ideas of substances are problematic. And we have just seen that we have no real understanding of the connection between our ideas and the objects that produce them.

The good news, however, is that while our knowledge might not be very extensive, it is sufficient for our needs. Locke’s memorable nautical metaphor holds that: “’Tis of great use to the Sailor to know the length of his Line, though he cannot with it fathom all the depths of the Ocean. ’Tis well he knows, that it is long enough to reach the bottom, at such Places, as are necessary to direct his Voyage, and caution him against running upon Shoales, that may ruin him. Our Business here is not to know all things, but those which concern our Conduct.” (1.1.6, 46). Locke thinks we have enough knowledge to live comfortable lives on Earth, to realize that there is a God, to understand morality and behave appropriately, and to gain salvation. Our knowledge of morality, in particular, is very good. Locke even suggests that we might develop a demonstrable system of morality similar to Euclid’s demonstrable system of geometry. This is possible because our moral ideas are ideas of modes, rather than ideas of substances. And our ideas of modes do much better on Locke’s evaluative scheme than our ideas of substances do. Finally, while the limits to our knowledge might be disappointing, Locke notes that recognizing these limits is important and useful insofar as it will help us to better organize our intellectual inquiry. We will be saved from investigating questions which we could never know the answers to and can focus our efforts on areas where progress is possible.

One benefit of Locke’s somewhat bleak assessment of the scope of our knowledge was that it caused him to focus on an area which was underappreciated by many of his contemporaries. This was the arena of judgment or opinion, belief states which fall short of knowledge. Given that we have so little knowledge (that we can be certain of so little) the realm of probability becomes very important. Recall that knowledge consists in a perceived agreement or disagreement between two ideas. Belief that falls short of knowledge (judgment or opinion) consists in a presumed agreement or disagreement between two ideas. Consider an example: I am not entirely sure who the Prime Minister of Canada is, but I am somewhat confident it is Stephen Harper. Locke’s claim is that in judging that the Canadian PM is Stephen Harper I am acting as though a relation holds between the two ideas. I do not directly perceive a connection between my idea of Stephen Harper and my idea of the Canadian PM, but I presume that one exists.

After offering this account of what judgment is, Locke offers an analysis of how and why we form the opinions we do and offers some recommendations for forming our opinions responsibly. This includes a diagnosis of the errors people make in judging, a discussion of the different degrees of assent, and an interesting discussion of the epistemic value of testimony.

3. Special Topics in the Essay

As discussed above, the main project of the Essay is an examination of the human understanding and an analysis of knowledge. But the Essay is a rather expansive work and contains discussion of many other topics of philosophical interest. Some of these will be discussed below. A word of warning, however, is required before proceeding. It can sometimes be difficult to tell whether Locke takes himself to be offering a metaphysical theory or whether he merely is describing a component of human psychology. For example, we might question whether his account of personal identity is meant to give necessary and sufficient conditions for a metaphysical account of personhood or whether it is merely designed to tell us what sorts of identity attributions we do and should make and why. We may further question whether, when discussing primary and secondary qualities, Locke is offering a theory about how perception really works or whether this discussion is a mere digression used to illustrate a point about the nature of our ideas. So while many of these topics have received a great deal of attention, their precise relationship to the main project of the Essay can be difficult to locate.

a. Primary and Secondary Qualities

Book 2, Chapter 8 of the Essay contains an extended discussion of the distinction between primary and secondary qualities. Locke was hardly original in making this distinction. By the time the Essay was published, it had been made by many others and was even somewhat commonplace. That said, Locke’s formulation of the distinction and his analysis of the related issues has been tremendously influential and has provided the framework for much of the subsequent discussion on the topic.

Locke defines a quality as a power that a body has to produce ideas in us. So a simple object like a baked potato which can produce ideas of brownness, heat, ovular shape, solidity, and determinate size must have a series of corresponding qualities. There must be something in the potato which gives us the idea of brown, something in the potato which gives us the idea of ovular shape, and so on. The primary/secondary quality distinction claims that some of these qualities are very different from others.

Locke motivates the distinction between two types of qualities by discussing how a body could produce an idea in us. The theory of perception endorsed by Locke is highly mechanical. All perception occurs as a result of motion and collision. If I smell the baked potato, there must be small material particles which are flying off of the potato and bumping into nerves in my nose. The motion in the nose-nerves causes a chain reaction along my nervous system until eventually there is some motion in my brain and I experience the idea of a certain smell. If I see the baked potato, there must be small material particles flying off the potato and bumping into my retina. That bumping causes a similar chain reaction which ends in my experience of a certain roundish shape.

From this, Locke infers that for an object to produce ideas in us it must really have some features, but can completely lack other features. This mechanical theory of perception requires that objects producing ideas in us have shape, extension, mobility, and solidity. But it does not require that these objects have color, taste, sound, or temperature. So the primary qualities are qualities actually possessed by bodies. These are features that a body cannot be without. The secondary qualities, by contrast, are not really had by bodies. They are just ways of talking about the ideas that can be produced in us by bodies in virtue of their primary qualities. So when we claim that the baked potato is solid, this means that solidity is one of its fundamental features. But when I claim that it smells a certain earthy kind of way, this just means that its fundamental features are capable of producing the idea of the earthy smell in my mind.

These claims lead to Locke’s claims about resemblance: “From whence I think it is easie to draw this Observation, That the Ideas of primary Qualities of Bodies, are Resemblances of them, and their Patterns do really exist in the Bodies themselves; but the Ideas, produced in us by these Secondary Qualities, have no resemblance of them at all.” (2.8.14, 137). Insofar as my idea of the potato is of something solid, extended, mobile, and possessing a certain shape my idea accurately captures something about the real nature of the potato. But insofar as my idea of the potato is of something with a particular smell, temperature, and taste my ideas do not accurately capture mind-independent facts about the potato.

b. Mechanism

Around the time of the Essay the mechanical philosophy was emerging as the predominant theory about the physical world. The mechanical philosophy held that the fundamental entities in the physical world were small individual bodies called corpuscles. Each corpuscle was solid, extended, and had a certain shape. These corpuscles could combine together to form ordinary objects like rocks, tables, and plants. The mechanical philosophy argued that all features of bodies and all natural phenomena could be explained by appeal to these corpuscles and their basic properties (in particular, size, shape, and motion).

Locke was exposed to the mechanical philosophy while at Oxford and became acquainted with the writings of its most prominent advocates. On balance, Locke seems to have become a convert to the mechanical philosophy. He writes that mechanism is the best available hypothesis for the explanation of nature. We have already seen some of the explanatory work done by mechanism in the Essay. The distinction between primary and secondary qualities was a hallmark of the mechanical philosophy and neatly dovetailed with mechanist accounts of perception. Locke reaffirms his commitment to this account of perception at a number of other points in the Essay. And when discussing material objects Locke is very often happy to allow that they are composed of material corpuscles. What is peculiar, however, is that while the Essay does seem to have a number of passages in which Locke supports mechanical explanations and speaks highly of mechanism, it also contains some highly critical remarks about mechanism and discussions of the limits of the mechanical philosophy.

Locke’s critiques of mechanism can be divided into two strands. First, he recognized that there were a number of observed phenomena which mechanism struggled to explain. Mechanism did offer neat explanations of some observed phenomena. For example, the fact that objects could be seen but not smelled through glass could be explained by positing that the corpuscles which interacted with our retinas were smaller than the ones which interacted with our nostrils. So the sight corpuscles could pass through the spaces between the glass corpuscles, but the smell corpuscles would be turned away. But other phenomena were harder to explain. Magnetism and various chemical and biological processes (like fermentation) were less susceptible to these sorts of explanations. And universal gravitation, which Locke took Newton to have proved the existence of in the Principia, was particularly hard to explain. Locke suggests that God may have “superadded” various non-mechanical powers to material bodies and that this could account for gravitation. (Indeed, at several points he even suggests that God may have superadded the power of thought to matter and that humans might be purely material beings.)

Locke’s second set of critiques pertains to theoretical problems in the mechanical philosophy. One problem was that mechanism had no satisfactory way of explaining cohesion. Why do corpuscles sometimes stick together? If things like tables and chairs are just collections of small corpuscles then they should be very easy to break apart, the same way I can easily separate one group of marbles from another. Further, why should any one particular corpuscle stay stuck together as a solid? What accounts for its cohesion? Again, mechanism seems hard-pressed to offer an answer. Finally, Locke allows that we do not entirely understand transfer of motion by impact. When one corpuscle collides with another we actually do not have a very satisfying explanation for why the second moves away under the force of the impact.

Locke presses these critiques with some skill and in a serious manner. Still, ultimately he is guardedly optimistic about mechanism. This somewhat mixed attitude on Locke’s part has led commentators to debate questions about his exact attitude toward the mechanical philosophy and his motivations for discussing it.

c. Volition and Agency

In Book 2, Chapter 21 of the Essay Locke explores the topic of the will. One of the things which separates people from rocks and billiard balls is our ability to make decisions and control our actions. We feel that we are free in certain respects and that we have the power to choose certain thoughts and actions. Locke calls this power the will. But there are tricky questions about what this power consists in and about what it takes to freely (or voluntarily) choose something. 2.21 contains a delicate and sustained discussion of these tricky questions.

Locke begins with questions of freedom and then proceeds to a discussion of the will. On Locke’s analysis, we are free to do those things which we both will to do and are physically capable of doing. For example, if I wish to jump into a lake and have no physical maladies which prevent it, then I am free to jump into the lake. By contrast, if I do not wish to jump into the lake, but a friend pushes me in, I did not act freely when I entered the water. Or, if I wish to jump into the lake, but have a spinal injury and cannot move my body, then I do not act freely when I stay on the shore. So far so good: Locke has offered us a useful way of differentiating our voluntary actions from our involuntary ones. But there is still a pressing question about freedom and the will: that of whether the will is itself free. When I am deciding whether or not to jump into the water, is the will determined by outside factors to choose one or the other? Or can it, so to speak, make up its own mind and choose either option?

Locke’s initial position in the chapter is that the will is determined. But in later sections he offers a qualification of sorts. In normal circumstances, the will is determined by what Locke calls uneasiness: “What is it that determines the Will in regard to our Actions? … some (and for the most part the most pressing) uneasiness a Man is at present under. That is that which successively determines the Will, and sets us upon those Actions, we perform.” (2.21.31, 250-1). The uneasiness is caused by the absence of something that is perceived as good. The perception of the thing as good gives rise to a desire for that thing. Suppose I choose to eat a slice of pizza. Locke would say I must have made this choice because the absence of the pizza was troubling me somehow (I was feeling hunger pains, or longing for something savory) and this discomfort gave rise to a desire for food. That desire in turn determined my will to choose to eat pizza.

Locke’s qualification to this account of the will being determined by uneasiness has to do with what he calls suspension. Beginning with the second edition of the Essay, Locke began to argue that the most pressing desire for the most part determines the will, but not always: “For the mind having in most cases, as is evident in Experience, a power to suspend the execution and satisfaction of any of its desires, and so all, one after another, is at liberty to consider the objects of them; examine them on all sides, and weigh them with others.” (2.21.47, 263). So even if, at this moment, my desire for pizza is the strongest desire, Locke thinks I can pause before I decide to eat the pizza and consider the decision. I can consider other items in my desire set: my desire to lose weight, or to leave the pizza for my friend, or to keep a vegan diet. Careful consideration of these other possibilities might have the effect of changing my desire set. If I really focus on how important it is to stay fit and healthy by eating nutritious foods then my desire to leave the pizza might become stronger than my desire to eat it and my will may be determined to choose to not eat the pizza. But of course we can always ask whether a person has a choice whether or not to suspend judgment or whether the suspension of judgment is itself determined by the mind’s strongest desire. On this point Locke is somewhat vague. While most interpreters think our desires determine when judgment is suspended, some others disagree and argue that suspension of judgment offers Lockean agents a robust form of free will.

d. Personhood and Personal Identity

Locke was one of the first philosophers to give serious attention to the question of personal identity. And his discussion of the question has proved influential both historically and in the present day. The discussion occurs in the midst of Locke’s larger discussion of the identity conditions for various entities in Book II, Chapter 27. At heart, the question is simple: what makes me the same person as the person who did certain things in the past and who will do certain things in the future? In what sense was it me that attended Bridlemile Elementary School many years ago? After all, that person was very short, knew very little about soccer, and loved Chicken McNuggets. I, on the other hand, am average height, know tons of soccer trivia, and get rather queasy at the thought of eating chicken, especially in nugget form. Nevertheless, it is true that I am identical to the boy who attended Bridlemile.

In Locke’s time, the topic of personal identity was important for religious reasons. Christian doctrine held that there was an afterlife in which virtuous people would be rewarded in heaven and sinful people would be punished in hell. This scheme provided motivation for individuals to behave morally. But, for this to work, it was important that the person who is rewarded or punished is the same person as the one who lived virtuously or lived sinfully. And this had to be true even though the person being rewarded or punished had died, had somehow continued to exist in an afterlife, and had somehow managed to be reunited with a body. So it was important to get the issue of personal identity right.

Locke’s views on personal identity involve a negative project and a positive project. The negative project involves arguing against the view that personal identity consists in or requires the continued existence of a particular substance. And the positive project involves defending the view that personal identity consists in continuity of consciousness. We can begin with this positive view. Locke defines a person as “a thinking intelligent Being, that has reason and reflection, and can consider itself as itself, the same thinking thing in different times and places; which it does only by that consciousness, which is inseparable from thinking, and as it seems to me essential to it.” (2.27.9, 335). Locke suggests here that part of what makes a person the same through time is their ability to recognize past experiences as belonging to them. For me, part of what differentiates one little boy who attended Bridlemile Elementary from all the other children who went there is my realization that I share in his consciousness. Put differently, my access to his lived experience at Bridlemile is very different from my access to the lived experiences of others there: it is first-personal and immediate. I recognize his experiences there as part of a string of experiences that make up my life and join up to my current self and current experiences in a unified way. That is what makes him the same person as me.

Locke believes that this account of personal identity as continuity of consciousness obviates the need for an account of personal identity given in terms of substances. A traditional view held that there was a metaphysical entity, the soul, which guaranteed personal identity through time; wherever there was the same soul, the same person would be there as well. Locke offers a number of thought experiments to cast doubt on this belief and show that his account is superior. For example, if a soul was wiped clean of all its previous experiences and given new ones (as might be the case if reincarnation were true), the same soul would not justify the claim that all of those who had had it were the same person. Or, we could imagine two souls who had their conscious experiences completely swapped. In this case, we would want to say that the person went with the conscious experiences and did not remain with the soul.

Locke’s account of personal identity seems to be a deliberate attempt to move away from some of the metaphysical alternatives and to offer an account which would be acceptable to individuals from a number of different theological backgrounds. Of course, a number of serious challenges have been raised for Locke’s account. Most of these focus on the crucial role seemingly played by memory. And the precise details of Locke’s positive proposal in 2.27 have been hard to pin down. Nevertheless, many contemporary philosophers believe that there is an important kernel of truth in Locke’s analysis.

e. Real and Nominal Essences

Locke’s distinction between the real essence of a substance and the nominal essence of a substance is one of the most fascinating components of the Essay. Scholastic philosophers had held that the main goal of metaphysics and science was to learn about the essences of things: the key metaphysical components of things which explained all of their interesting features. Locke thought this project was misguided. That sort of knowledge, knowledge of the real essences of beings, was unavailable to human beings. This led Locke to suggest an alternative way to understand and investigate nature; he recommends focusing on the nominal essences of things.

When Locke introduces the term real essence he uses it to refer to the “real constitution of any Thing, which is the foundation of all those Properties, that are combined in, and are constantly found to co-exist with [an object]” (3.6.6, 442). For the Scholastics this real essence would be an object’s substantial form. For proponents of the mechanical philosophy it would be the number and arrangement of the material corpuscles which composed the body. Locke sometimes endorses this latter understanding of real essence. But he insists that these real essences are entirely unknown and undiscoverable by us. The nominal essences, by contrast, are known and are the best way we have to understand individual substances. Nominal essences are just collections of all the observed features an individual thing has. So the nominal essence of a piece of gold would include the ideas of yellowness, a certain weight, malleability, dissolvability in certain chemicals, and so on.

Locke offers us a helpful analogy to illustrate the difference between real and nominal essences. He suggests that our position with respect to ordinary objects is like the position of someone looking at a very complicated clock. The gears, wheels, weights, and pendulum that produce the motions of the hands on the clock face (the clock’s real essence) are unknown to the person. They are hidden behind the casing. He or she can only know about the observable features like the clock’s shape, the movement of the hands, and the chiming of the hours (the clock’s nominal essence). Similarly, when I look at an object like a dandelion, I am only able to observe its nominal essence (the yellow color, the bitter smell, and so forth). I have no clear idea what produces these features of the dandelion or how they are produced.

Locke’s views on real and nominal essences have important consequences for his views about the division of objects into groups and sorts. Why do we consider some things to be zebras and other things to be rabbits? Locke’s view is that we group according to nominal essence, not according to (unknown) real essence. But this has the consequence that our groupings might fail to adequately reflect whatever real distinctions there might be in nature. So Locke is not a realist about species or types. Instead, he is a conventionalist. We project these divisions on the world when we choose to classify objects as falling under the various nominal essences we’ve created.

f. Religious Epistemology

The epistemology of religion (claims about our understanding of God and our duties with respect to him) was tremendously contentious during Locke’s lifetime. The English Civil War, fought during Locke’s youth, was in large part a disagreement over the right way to understand the Christian religion and the requirements of religious faith. Throughout the seventeenth century, a number of fundamentalist Christian sects continually threatened the stability of English political life. And the status of Catholic and Jewish people in England was a vexed one.

So the stakes were very high when, in 4.18, Locke discussed the nature of faith and reason and their respective domains. He defines reason as an attempt to discover certainty or probability through the use of our natural faculties in the investigation of the world. Faith, by contrast, is certainty or probability attained through a communication believed to have come, originally, from God. So when Smith eats a potato chip and comes to believe it is salty, she believes this according to reason. But when Smith believes that Joshua made the sun stand still in the sky because she read it in the Bible (which she takes to be divine revelation), she believes according to faith.

Although it initially sounds as though Locke has carved out quite separate roles for faith and reason, it must be noted that these definitions make faith subordinate to reason in a subtle way. For, as Locke explains: “Whatever GOD hath revealed, is certainly true; no Doubt can be made of it. This is the proper Object of Faith: But whether it be a divine Revelation, or no, Reason must judge; which can never permit the Mind to reject a greater Evidence to embrace what is less evident, nor allow it to entertain Probability in opposition to Knowledge and Certainty.” (4.18.10, 695). First, Locke thinks that if any proposition, even one which purports to be divinely revealed, clashes with the clear evidence of reason then it should not be believed. So, even if it seems like God is telling us that 1+1=3, Locke claims we should go on believing that 1+1=2 and we should deny that the 1+1=3 revelation was genuine. Second, Locke thinks that to determine whether or not something is divinely revealed we have to exercise our reason. How can we tell whether the Bible contains God’s direct revelation conveyed through the inspired Biblical authors or whether it is instead the work of mere humans? Only reason can help us settle that question. Locke thinks that those who ignore the importance of reason in determining what is and is not a matter of faith are guilty of “enthusiasm.” And in a chapter added to later editions of the Essay Locke sternly warns his readers against the serious dangers posed by this intellectual vice.

In all of this Locke emerges as a strong moderate. He himself was deeply religious and took religious faith to be important. But he also felt that there were serious limits to what could be justified through appeals to faith. The issues discussed in this section will be very important below where Locke’s views on the importance of religious toleration are discussed.

4. Political Philosophy

Locke lived during a very eventful time in English politics. The Civil War, Interregnum, Restoration, Exclusion Crisis, and Glorious Revolution all happened during his lifetime. For much of his life Locke held administrative positions in government and paid very careful attention to contemporary debates in political theory. So it is perhaps unsurprising that he wrote a number of works on political issues. In this field, Locke is best known for his arguments in favor of religious toleration and limited government. Today these ideas are commonplace and widely accepted. But in Locke’s time they were highly innovative, even radical.

a. The Two Treatises

Locke’s Two Treatises of Government were published in 1689. It was originally thought that they were intended to defend the Glorious Revolution and William’s seizure of the throne. We now know, however, that they were in fact composed much earlier. Nonetheless, they do lay out a view of government amenable to many of William’s supporters.

The First Treatise is now of primarily historical interest. It takes the form of a detailed critique of a work called Patriarcha by Robert Filmer. Filmer had argued, in a rather unsophisticated way, in favor of divine right monarchy. On his view, the power of kings ultimately originated in the dominion which God gave to Adam and which had passed down in an unbroken chain through the ages. Locke disputes this picture on a number of historical grounds. Perhaps more importantly, Locke also distinguishes between a number of different types of dominion or governing power which Filmer had run together.

After clearing some ground in the First Treatise, Locke offers a positive view of the nature of government in the much better known Second Treatise. Part of Locke’s strategy in this work was to offer a different account of the origins of government. While Filmer had suggested that humans had always been subject to political power, Locke argues for the opposite. According to him, humans were initially in a state of nature. The state of nature was apolitical in the sense that there were no governments and each individual retained all of his or her natural rights. People possessed these natural rights (including the right to attempt to preserve one’s life, to seize unclaimed valuables, and so forth) because they were given by God to all of his people.

The state of nature was inherently unstable. Individuals would be under constant threat of physical harm. And they would be unable to pursue any goals that required stability and widespread cooperation with other humans. Locke’s claim is that government arose in this context. Individuals, seeing the benefits which could be gained, decided to relinquish some of their rights to a central authority while retaining other rights. This took the form of a contract. In exchange for relinquishing certain rights, individuals would receive protection from physical harm, security for their possessions, and the ability to interact and cooperate with other humans in a stable environment.

So, according to this view, governments were instituted by the citizens of those governments. This has a number of very important consequences. On this view, rulers have an obligation to be responsive to the needs and desires of these citizens. Further, in establishing a government the citizens had relinquished some, but not all of their original rights. So no ruler could claim absolute power over all elements of a citizen’s life. This carved out important room for certain individual rights or liberties. Finally, and perhaps most importantly, a government which failed to adequately protect the rights and interests of its citizens or a government which attempted to overstep its authority would be failing to perform the task for which it was created. As such, the citizens would be entitled to revolt and replace the existing government with one which would suitably carry out the duties of ensuring peace and civil order while respecting individual rights.

So Locke was able to use the account of natural rights and a government created through contract to accomplish a number of important tasks. He could use it to show why individuals retain certain rights even when they are subject to a government. He could use it to show why despotic governments which attempted to unduly infringe on the rights of their citizens were bad. And he could use it to show that citizens had a right to revolt in instances where governments failed in certain ways. These are powerful ideas which remain important even today.

For more, see the article Political Philosophy.

b. Property

Locke’s Second Treatise on government contains an influential account of the nature of private property. According to Locke, God gave humans the world and its contents to have in common. The world was to provide humans with what was necessary for the continuation and enjoyment of life. But Locke also believed it was possible for individuals to appropriate individual parts of the world and justly hold them for their own exclusive use. Put differently, Locke believed that we have a right to acquire private property.

Locke’s claim is that we acquire property by mixing our labor with some natural resource. For example, if I discover some grapes growing on a vine, through my labor in picking and collecting these grapes I acquire an ownership right over them. If I find an empty field and then use my labor to plow the field then plant and raise crops, I will be the proper owner of those crops. If I chop down trees in an unclaimed forest and use the wood to fashion a table, then that table will be mine. Locke places two important limitations on the way in which property can be acquired by mixing one’s labor with natural resources. First, there is what has come to be known as the Waste Proviso. One must not take so much property that some of it goes to waste. I should not appropriate gallons and gallons of grapes if I am only able to eat a few and the rest end up rotting. If the goods of the Earth were given to us by God, it would be inappropriate to allow some of this gift to go to waste. Second, there is the Enough-And-As-Good Proviso. This says that in appropriating resources I am required to leave enough and as good for others to appropriate. If the world was left to us in common by God, it would be wrong of me to appropriate more than my fair share and fail to leave sufficient resources for others.

After currency is introduced and after governments are established the nature of property obviously changes a great deal. Using metal, which can be made into coins and which does not perish the way foodstuffs and other goods do, individuals are able to accumulate much more wealth than would be possible otherwise. So the proviso concerning waste seems to drop away. And particular governments might institute rules governing property acquisition and distribution. Locke was aware of this and devoted a great deal of thought to the nature of property and the proper distribution of property within a commonwealth. His writings on economics, monetary policy, charity, and social welfare systems are evidence of this. But Locke’s views on property inside of a commonwealth have received far less attention than his views on the original acquisition of property in the state of nature.

c. Toleration

Locke had been systematically thinking about issues relating to religious toleration since his early years in London and even though he only published his Epistola de Tolerantia (A Letter Concerning Toleration) in 1689 he had finished writing it several years before. The question of whether or not a state should attempt to prescribe one particular religion within the state, what means states might use to do so, and what the correct attitude should be toward those who resist conversion to the official state religion had been central to European politics ever since the Protestant Reformation. Locke’s time in England, France, and the Netherlands had given him experiences of three very different approaches to these questions. These experiences had convinced him that, for the most part, individuals should be allowed to practice their religion without interference from the state. Indeed, part of the impetus for the publication of Locke’s Letter Concerning Toleration came from Louis XIV’s revocation of the Edict of Nantes, which took away the already limited rights of Protestants in France and exposed them to state persecution.

It is possible to see Locke’s arguments in favor of toleration as relating both to the epistemological views of the Essay and the political views of the Two Treatises. Relating to Locke’s epistemological views, recall from above that Locke thought the scope of human knowledge was extremely restricted. We might not be particularly good at determining what the correct religion is. There is no reason to think that those holding political power will be any better at discovering the true religion than anyone else, so they should not attempt to enforce their views on others. Instead, each individual should be allowed to pursue true beliefs as best as they are able. Little harm results from allowing others to have their own religious beliefs. Indeed, it might be beneficial to allow a plurality of beliefs because one group might end up with the correct beliefs and win others over to their side.

Relating to Locke’s political views, as expressed in the Two Treatises, Locke endorses toleration on the grounds that the enforcement of religious conformity is outside the proper scope of government. People consent to governments for the purpose of establishing social order and the rule of law. Governments should refrain from enforcing religious conformity because doing so is unnecessary and irrelevant for these ends. Indeed, attempting to enforce conformity may positively harm these ends as it will likely lead to resistance from members of prohibited religions. Locke also suggests that governments should tolerate the religious beliefs of individual citizens because enforcing religious belief is actually impossible. Acceptance of a certain religion is an inward act, a function of one’s beliefs. But governments are designed to control people’s actions. So governments are, in many ways, ill-equipped to enforce the adoption of a particular religion because individual people have an almost perfect control of their own thoughts.

While Locke’s views on toleration were very progressive for the time and while his views do have an affinity with our contemporary consensus on the value of religious toleration it is important to recognize that Locke did place some severe limits on toleration. He did not think that we should tolerate the intolerant, those who would seek to forcibly impose their religious views on others. Similarly, any religious group who posed a threat to political stability or public safety should not be tolerated. Importantly, Locke included Roman Catholics in this group. On his view, Catholics had a fundamental allegiance to the Pope, a foreign prince who did not recognize the sovereignty of English law. This made Catholics a threat to civil government and peace. Finally, Locke also believed that atheists should not be tolerated. Because they did not believe they would be rewarded or punished for their actions in an afterlife, Locke did not think they could be trusted to behave morally or maintain their contractual obligations.

5. Theology

We have already seen that in the Essay Locke developed an account of belief according to faith and belief according to reason. Recall that an agent believes according to reason when she discovers something through the use of her natural faculties and she believes according to faith when she takes something as truth because she understands it to be a message from God. Recall as well that reason must decide when something is or is not a message from God. The goal of Locke’s The Reasonableness of Christianity is to show that it is reasonable to be a Christian. Locke argues that we do have sufficient reason to think that the central truths of Christianity were communicated to us by God through his messenger, Jesus of Nazareth.

For Locke’s project to succeed he needed to show that Jesus provided his original followers with sufficient evidence that he was a legitimate messenger from God. Given that numerous individuals in history had purported to be the recipients of divine revelation, there must be something special which set Jesus apart. Locke offers two considerations in this regard. The first is that Jesus fulfilled a number of historical predictions concerning the coming of a Messiah. The second is that Jesus performed a number of miracles which attest that he had a special relationship to God. Locke also claims that we have sufficient reason to believe that these miracles actually occurred on the basis of testimony from those who witnessed them first-hand and a reliable chain of reporting from Jesus’ time into our own. This argument leads Locke into a discussion of the types and value of testimony which many philosophers have found to be interesting in its own right.

One striking feature of The Reasonableness of Christianity is the requirement for salvation that Locke endorses. Disputes about which precise beliefs were necessary for salvation and eternal life in Heaven were at the core of much religious disagreement in Locke’s time. Different denominations and sects claimed that they, and often only they, had the correct beliefs. Locke, by contrast, argued that to be a true Christian and worthy of salvation an individual need only believe one simple truth: that Jesus is the Messiah. Of course, Locke believed there were many other important truths in the Bible. But he thought these other truths, especially those contained in the Epistles rather than the Gospels, could be difficult to interpret and could lead to disputes and disagreement. The core tenet of Christianity, however, that Jesus is the Messiah, was a mandatory belief.

In making the requirements for Christian faith and salvation so minimal Locke was part of a growing faction in the Church of England. These individuals, often known as latitudinarians, were deliberately attempting to construct a more irenic Christianity with the goal of avoiding the conflict and controversy that previous internecine fights had produced. So Locke was hardly alone in attempting to find a set of core Christian commitments which were free of sectarian theological baggage. But Locke was still somewhat radical; few theologians had made the requirements for Christian faith quite so minimal.

6. Education

Locke was regarded by many in his time as an expert on educational matters. He taught many students at Oxford and also served as a private tutor. Locke’s correspondence shows that he was constantly asked to recommend tutors and offer pedagogical advice. Locke’s expertise led to his most important work on the subject: Some Thoughts Concerning Education. The work had its origins in a series of letters Locke wrote to Edward Clarke offering advice on the education of Clarke’s children and was first published in 1693.

Locke’s views on education were, for the time, quite forward-looking. Classical languages, usually learned through tedious exercises involving rote memorization, and corporal punishment were two predominant features of the seventeenth-century English educational system. Locke saw little use for either. Instead, he emphasized the importance of teaching practical knowledge. He recognized that children learn best when they are engaged with the subject matter. Locke also foreshadowed some contemporary pedagogical views by suggesting that children should be allowed some self-direction in their course of study and should have the ability to pursue their interests.

Locke believed it was important to take great care in educating the young. He recognized that habits and prejudices formed in youth could be very hard to break in later life. Thus, much of Some Thoughts Concerning Education focuses on morality and the best ways to inculcate virtue and industry. Locke rejected authoritarian approaches. Instead, he favored methods that would help children to understand the difference between right and wrong and to cultivate a moral sense of their own.

7. Locke’s Influence

The Essay was quickly recognized as an important philosophical contribution both by its admirers and by its critics. Before long it had been incorporated into the curriculum at Oxford and Cambridge and its translation into both Latin and French garnered it an audience on the Continent as well. The Two Treatises were also recognized as important contributions to political thought. While the work had some success in England among those favorably disposed to the Glorious Revolution, its primary impact was abroad. During the American Revolution (and to a lesser extent, during the French Revolution) Locke’s views were often appealed to by those seeking to establish more representative forms of government.

Related to this last point, Locke came to be seen, alongside his friend Newton, as an embodiment of Enlightenment values and ideals. Newtonian science would lay bare the workings of nature and lead to important technological advances. Lockean philosophy would lay bare the workings of men’s minds and lead to important reforms in law and government. Voltaire played an instrumental role in shaping this legacy for Locke and worked hard to publicize Locke’s views on reason, toleration, and limited government. Locke also came to be seen as an inspiration for the Deist movement. Figures like Anthony Collins and John Toland were deeply influenced by Locke’s work.

Locke is often recognized as the founder of British Empiricism and it is true that Locke laid the foundation for much of English-language philosophy in the 18th and early 19th centuries. But those who followed in his footsteps were not unquestioning followers. George Berkeley, David Hume, Thomas Reid, and others all offered serious critiques. In recent decades, readers have attempted to offer more charitable reconstructions of Locke’s philosophy. Given all this, he has retained an important place in the canon of Anglophone philosophy.

8. References and Further Reading

a. Locke’s Works

Laslett, P. [ed.] 1988. Two Treatises of Government. Cambridge: Cambridge University Press.
Locke, J. 1823. The Works of John Locke. London: Printed for T. Tegg (10 volumes).
Locke, J. The Clarendon Edition of the Works of John Locke, Oxford University Press, 2015. This edition includes the following volumes:
Nidditch, P. [ed.] 1975. An Essay Concerning Human Understanding.
Nidditch, P. and G.A.J. Rogers [eds.] 1990. Drafts for the Essay Concerning Human Understanding.
Yolton, J.W. and J.S. Yolton. [eds.] 1989. Some Thoughts Concerning Education.
Higgins-Biddle, J.C. [ed.] 1999. The Reasonableness of Christianity.
Milton, J.R. and P. Milton. [eds.] 2006. An Essay Concerning Toleration.
de Beer, E.S. [ed.] 1976-1989. The Correspondence of John Locke. (8 volumes).
von Leyden, W. [ed.] 1954. Essays on the Law of Nature. Oxford: Clarendon Press.

b. Recommended Reading

The following are recommendations for further reading on Locke. Each work has a brief statement indicating its contents.

Anstey, P. 2011. John Locke & Natural Philosophy. Oxford: Oxford University Press.
A thorough examination of Locke’s scientific and medical thinking.
Ayers, M. 1993. Locke: Epistemology and Ontology. New York: Routledge.
A classic in Locke studies. Explores philosophical topics in the Essay and discusses Locke’s project as a whole. One volume on epistemology and one on metaphysics.
Chappell, V. 1994. The Cambridge Companion to Locke. Cambridge: Cambridge University Press.
A series of essays focusing on all aspects of Locke’s thought.
LoLordo, A. 2012. Locke’s Moral Man. Oxford: Oxford University Press.
An exploration and discussion of themes at the intersection of Locke’s moral and political thought. Focuses particularly on agency, personhood, and rationality.
Lowe, E.J. 2005. Locke. New York: Routledge.
An introductory overview of Locke’s philosophical and political thought.
Mackie, J.L. 1976. Problems from Locke. Oxford: Oxford University Press.
Uses Locke’s work to raise and discuss a number of philosophical issues and puzzles.
Newman, L. 2007. The Cambridge Companion to Locke’s Essay Concerning Human Understanding. Cambridge: Cambridge University Press.
A series of essays focusing on specific issues in Locke’s Essay.
Pyle, A.J. 2013. Locke. London: Polity.
An excellent and brief introduction to Locke’s thought and historical context. A very good place to start for beginners.
Rickless, S. 2014. Locke. Malden, MA: Blackwell.
An introductory overview of Locke’s philosophical and political thought.
Stuart, M. 2013. Locke’s Metaphysics. Oxford: Oxford University Press.
An in-depth treatment of metaphysical issues and problems in the Essay.
Waldron, J. 2002. God, Locke, and Equality: Christian Foundation of Locke’s Political Thought. Cambridge: Cambridge University Press.
An examination of some key issues in Locke’s political thought.
Woolhouse, R. 2009. Locke: A Biography. Cambridge: Cambridge University Press.
The best and most recent biography of Locke’s life.

Author Information

Patrick J. Connolly
Email: pac317@lehigh.edu
Lehigh University
U. S. A.

Act and Rule Utilitarianism

Utilitarianism is one of the best known and most influential moral theories. Like other forms of consequentialism, its core idea is that whether actions are morally right or wrong depends on their effects. More specifically, the only effects of actions that are relevant are the good and bad results that they produce. A key point in this article concerns the distinction between individual actions and types of actions. Act utilitarians focus on the effects of individual actions (such as John Wilkes Booth’s assassination of Abraham Lincoln) while rule utilitarians focus on the effects of types of actions (such as killing or stealing).

Utilitarians believe that the purpose of morality is to make life better by increasing the amount of good things (such as pleasure and happiness) in the world and decreasing the amount of bad things (such as pain and unhappiness). They reject moral codes or systems that consist of commands or taboos that are based on customs, traditions, or orders given by leaders or supernatural beings. Instead, utilitarians think that what makes a morality true or justifiable is its positive contribution to human (and perhaps non-human) beings.

The most important classical utilitarians are Jeremy Bentham (1748-1832) and John Stuart Mill (1806-1873). Bentham and Mill were both important theorists and social reformers. Their theory has had a major impact both on philosophical work in moral theory and on approaches to economic, political, and social policy. Although utilitarianism has always had many critics, there are many 21st-century thinkers who support it.

The task of determining whether utilitarianism is the correct moral theory is complicated because there are different versions of the theory, and its supporters disagree about which version is correct. This article focuses on perhaps the most important dividing line among utilitarians, the clash between act utilitarianism and rule utilitarianism. After a brief overall explanation of utilitarianism, the article explains both act utilitarianism and rule utilitarianism, the main differences between them, and some of the key arguments for and against each view.

1. Utilitarianism: Overall View
2. How Act Utilitarianism and Rule Utilitarianism Differ
3. Act Utilitarianism: Pros and Cons
4. Rule Utilitarianism: Pros and Cons
  a. Arguments for Rule Utilitarianism
    i. Why Rule Utilitarianism Maximizes Utility
    ii. Rule Utilitarianism Avoids the Criticisms of Act Utilitarianism
  b. Arguments against Rule Utilitarianism
5. Conclusion
6. References and Further Reading

1. Utilitarianism: Overall View

Utilitarianism is a philosophical view or theory about how we should evaluate a wide range of things that involve choices that people face. Among the things that can be evaluated are actions, laws, policies, character traits, and moral codes. Utilitarianism is a form of consequentialism because it rests on the idea that it is the consequences or results of actions, laws, policies, etc. that determine whether they are good or bad, right or wrong. In general, whatever is being evaluated, we ought to choose the option that will produce the best overall results. In the language of utilitarians, we should choose the option that “maximizes utility,” i.e. the action or policy that produces the largest amount of good.

Utilitarianism appears to be a simple theory because it consists of only one evaluative principle: Do what produces the best consequences. In fact, however, the theory is complex because we cannot understand that single principle unless we know (at least) three things: a) what things are good and bad; b) whose good (i.e. which individuals or groups) we should aim to maximize; and c) whether actions, policies, etc. are made right or wrong by their actual consequences (the results that our actions actually produce) or by their foreseeable consequences (the results that we predict will occur based on the evidence that we have).

a. What is Good?

Jeremy Bentham answered this question by adopting the view called hedonism. According to hedonism, the only thing that is good in itself is pleasure (or happiness). Hedonists do not deny that many different kinds of things can be good, including food, friends, freedom, and many other things, but hedonists see these as “instrumental” goods that are valuable only because they play a causal role in producing pleasure or happiness. Pleasure and happiness, however, are “intrinsic” goods, meaning that they are good in themselves and not because they produce some further valuable thing. Likewise, on the negative side, a lack of food, friends, or freedom is instrumentally bad because it produces pain, suffering, and unhappiness; but pain, suffering and unhappiness are intrinsically bad, i.e. bad in themselves and not because they produce some further bad thing.

Many thinkers have rejected hedonism because pleasure and pain are sensations that we feel, claiming that many important goods are not types of feelings. Being healthy or honest or having knowledge, for example, are thought by some people to be intrinsic goods that are not types of feelings. (People who think there are many such goods are called pluralists or “objective list” theorists.) Other thinkers see desires or preferences as the basis of value; whatever a person desires is valuable to that person. If desires conflict, then the things most strongly preferred are identified as good.

In this article, the term “well-being” will generally be used to identify what utilitarians see as good or valuable in itself. All utilitarians agree that things are valuable because they tend to produce well-being or diminish ill-being, but this idea is understood differently by hedonists, objective list theorists, and preference/desire theorists. This debate will not be further discussed in this article.

b. Whose Well-being?

Utilitarian reasoning can be used for many different purposes. It can be used both for moral reasoning and for any type of rational decision-making. In addition to applying in different contexts, it can also be used for deliberations about the interests of different persons and groups.

i. Individual Self-interest

(See egoism.) When individuals are deciding what to do for themselves alone, they consider only their own utility. For example, if you are choosing ice cream for yourself, the utilitarian view is that you should choose the flavor that will give you the most pleasure. If you enjoy chocolate but hate vanilla, you should choose chocolate for the pleasure it will bring and avoid vanilla because it will bring displeasure. In addition, if you enjoy both chocolate and strawberry, you should predict which flavor will bring you more pleasure and choose whichever one will do that.

In this case, because utilitarian reasoning is being applied to a decision about which action is best for an individual person, it focuses only on how the various possible choices will affect this single person’s interest and does not consider the interests of other people.

ii. Groups

People often need to judge what is best not only for themselves or other individuals but also what is best for groups, such as friends, families, religious groups, one’s country, etc. Because Bentham and other utilitarians were interested in political groups and public policies, they often focused on discovering which actions and policies would maximize the well-being of the relevant group. Their method for determining the well-being of a group involved adding up the benefits and losses that members of the group would experience as a result of adopting one action or policy. The well-being of the group is simply the sum total of the interests of all of its members.

To illustrate this method, suppose that you are buying ice cream for a party that ten people will attend. Your only flavor options are chocolate and vanilla, and some of the people attending like chocolate while others like vanilla. As a utilitarian, you should choose the flavor that will result in the most pleasure for the group as a whole. If seven like chocolate and three like vanilla and if all of them get the same amount of pleasure from the flavor they like, then you should choose chocolate. This will yield what Bentham, in a famous phrase, called “the greatest happiness for the greatest number.”
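The adding-up method in the party example can be sketched in a few lines of Python. This is only an illustration: the pleasure scores (1 for a liked flavor, 0 otherwise) and the names `guests` and `group_utility` are assumptions introduced here, not part of Bentham's own account.

```python
# Hypothetical pleasure scores (arbitrary units): 1 for a flavor a guest
# likes, 0 for one the guest does not. These numbers are assumptions.
guests = (
    [{"chocolate": 1, "vanilla": 0}] * 7    # seven guests who like chocolate
    + [{"chocolate": 0, "vanilla": 1}] * 3  # three guests who like vanilla
)


def group_utility(flavor, guests):
    """Sum every guest's pleasure from one flavor, counting each guest equally."""
    return sum(guest[flavor] for guest in guests)


# Total well-being of the group under each option.
totals = {flavor: group_utility(flavor, guests) for flavor in ("chocolate", "vanilla")}

# The utilitarian choice is the option with the largest total.
best = max(totals, key=totals.get)
print(totals, best)
```

On these assumed scores chocolate yields the larger total. If the scores were unequal (for instance, if the vanilla lovers cared far more intensely), the same summation could favor vanilla, which is why the example stipulates that everyone gets the same amount of pleasure.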

An important point in this case is that you should choose chocolate even if you are one of the three people who enjoy vanilla more than chocolate. The utilitarian method requires you to count everyone’s interests equally. You may not weigh some people’s interests—including your own—more heavily than others. Similarly, if a government is choosing a policy, it should give equal consideration to the well-being of all members of the society.

iii. Everyone Affected

While there are circumstances in which the utilitarian analysis focuses on the interests of specific individuals or groups, the utilitarian moral theory requires that moral judgments be based on what Peter Singer calls the “equal consideration of interests.” Utilitarian moral theory, then, includes the important idea that when we calculate the utility of actions, laws, or policies, we must do so from an impartial perspective and not from a “partialist” perspective that favors ourselves, our friends, or others we especially care about. Bentham is often cited as the source of a famous utilitarian axiom: “every man to count for one, nobody for more than one.”

If this impartial perspective is seen as necessary for a utilitarian morality, then both self-interest and partiality to specific groups will be rejected as deviations from utilitarian morality. For example, so-called “ethical egoism,” which says that morality requires people to promote their own interest, would be rejected either as a false morality or as not a morality at all. While a utilitarian method for determining what people’s interests are may show that it is rational for people to maximize their own well-being or the well-being of groups that they favor, utilitarian morality would reject this as a criterion for determining what is morally right or wrong.

c. Actual Consequences or Foreseeable Consequences?

Utilitarians disagree about whether judgments of right and wrong should be based on the actual consequences of actions or their foreseeable consequences. This issue arises when the actual effects of actions differ from what we expected. J. J. C. Smart (49) explains this difference by imagining the action of a person who, in 1938, saves someone from drowning. While we generally regard saving a drowning person as the right thing to do and praise people for such actions, in Smart’s imagined example, the person saved from drowning turns out to be Adolf Hitler. Had Hitler drowned, millions of other people might have been saved from suffering and death between 1938 and 1945. If utilitarianism evaluates the rescuer’s action based on its actual consequences, then the rescuer did the wrong thing. If, however, utilitarians judge the rescuer’s action by its foreseeable consequences (i.e. the ones the rescuer could reasonably predict), then the rescuer, who could not predict the negative effects of saving the person from drowning, did the right thing.

One reason for adopting foreseeable consequence utilitarianism is that it seems unfair to say that the rescuer acted wrongly, since the rescuer could not foresee the future bad effects of saving the drowning person. In response, actual consequence utilitarians argue that there is a difference between evaluating an action and evaluating the person who did the action. In their view, while the rescuer’s action was wrong, it would be a mistake to blame or criticize the rescuer because the bad results of his act were unforeseeable.

Foreseeable consequence utilitarians accept the distinction between evaluating actions and evaluating the people who carry them out, but they see no reason to make the moral rightness or wrongness of actions depend on facts that might be unknowable. For them, what is right or wrong for a person to do depends on what is knowable by a person at a time. For this reason, they claim that the person who rescued Hitler did the right thing, even though the actual consequences were unfortunate.

Another way to describe the actual vs. foreseeable consequence dispute is to contrast two thoughts. One (the actual consequence view) says that to act rightly is to do whatever produces the best consequences. The second view says that a person acts rightly by doing the action that has the highest level of “expected utility.” The expected utility is a combination of the good (or bad) effects that one predicts will result from an action and the probability of those effects occurring. In the case of the rescuer, the expected positive utility is high because the probability that saving a drowning person will lead to the deaths of millions of other people is extremely low, and thus can be ignored in deliberations about whether to save the drowning person.
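The notion of expected utility described above can be made concrete with a small Python sketch. All probabilities and utility values below are hypothetical numbers chosen only to illustrate the calculation, and the function name `expected_utility` is likewise an assumption.

```python
def expected_utility(outcomes):
    """Return the probability-weighted sum of utilities for one action.

    `outcomes` is a list of (probability, utility) pairs.
    """
    return sum(probability * utility for probability, utility in outcomes)


# Hypothetical numbers, assumed purely for illustration.
rescue = [
    (0.999999, 100.0),         # an ordinary person is saved
    (0.000001, -1_000_000.0),  # the rescued person later causes a catastrophe
]
do_nothing = [(1.0, -100.0)]   # the person drowns

eu_rescue = expected_utility(rescue)
eu_do_nothing = expected_utility(do_nothing)

# The foreseeable consequence view picks the action with the higher expected utility.
best_action = "rescue" if eu_rescue > eu_do_nothing else "do nothing"
print(best_action)
```

On these assumed figures the catastrophic outcome barely lowers the expected utility of rescuing, so rescuing remains the better choice. This matches the point that a very improbable bad outcome can be set aside in deliberation.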

What this shows is that actual consequence and foreseeable consequence utilitarians have different views about the nature of utilitarian theory. Foreseeable consequence utilitarians understand the theory as a decision-making procedure while actual consequence utilitarians understand it as a criterion of right and wrong. Foreseeable consequence utilitarians claim that the action with the highest expected utility is both the best thing to do based on current evidence and the right action. Actual consequence utilitarians might agree that the option with the highest expected utility is the best thing to do but they claim that it could still turn out to be the wrong action. This would occur if unforeseen bad consequences reveal that the option chosen did not have the best results and thus was the wrong thing to do.

2. How Act Utilitarianism and Rule Utilitarianism Differ

Both act utilitarians and rule utilitarians agree that our overall aim in evaluating actions should be to create the best results possible, but they differ about how to do that.

Act utilitarians believe that whenever we are deciding what to do, we should perform the action that will create the greatest net utility. In their view, the principle of utility—do whatever will produce the best overall results—should be applied on a case by case basis. The right action in any situation is the one that yields more utility (i.e. creates more well-being) than other available actions.

Rule utilitarians adopt a two part view that stresses the importance of moral rules. According to rule utilitarians, a) a specific action is morally justified if it conforms to a justified moral rule; and b) a moral rule is justified if its inclusion into our moral code would create more utility than other possible rules (or no rule at all). According to this perspective, we should judge the morality of individual actions by reference to general moral rules, and we should judge particular moral rules by seeing whether their acceptance into our moral code would produce more well-being than other possible rules.

The key difference between act and rule utilitarianism is that act utilitarians apply the utilitarian principle directly to the evaluation of individual actions while rule utilitarians apply the utilitarian principle directly to the evaluation of rules and then evaluate individual actions by seeing if they obey or disobey those rules whose acceptance will produce the most utility.
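The structural difference between the two views can be sketched as two small procedures in Python. Everything here is hypothetical: the promise-keeping case, the utility numbers, and names such as `act_utility` and `rule_acceptance_utility` are assumptions made for illustration, and the sketch takes no stand on which view is correct.

```python
# Hypothetical utilities of the two acts available in one particular case.
act_utility = {"keep promise": 10, "break promise": 11}

# Hypothetical utility of a whole society accepting each candidate rule.
rule_acceptance_utility = {
    "always keep promises": 100,
    "break promises whenever it seems better": 60,
}

# Which acts each candidate rule permits.
rule_permits = {
    "always keep promises": {"keep promise"},
    "break promises whenever it seems better": {"keep promise", "break promise"},
}


def act_utilitarian_choice(act_utility):
    """Apply the principle of utility directly to the individual acts."""
    return max(act_utility, key=act_utility.get)


def rule_utilitarian_verdict(rule_acceptance_utility, rule_permits):
    """Apply the principle of utility to rules, then judge acts by the best rule."""
    best_rule = max(rule_acceptance_utility, key=rule_acceptance_utility.get)
    return best_rule, rule_permits[best_rule]


act_choice = act_utilitarian_choice(act_utility)
best_rule, permitted_acts = rule_utilitarian_verdict(rule_acceptance_utility, rule_permits)
print(act_choice, best_rule, permitted_acts)
```

With these assumed numbers, the act utilitarian procedure selects breaking the promise because that single act yields slightly more utility, while the rule utilitarian procedure first selects the rule with the highest acceptance utility and then permits only the acts that conform to it.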

The contrast between act and rule utilitarianism, though previously noted by some philosophers, was not sharply drawn until the late 1950s when Richard Brandt introduced this terminology. (Other terms that have been used to make this contrast are “direct” and “extreme” for act utilitarianism, and “indirect” and “restricted” for rule utilitarianism.) Because the contrast had not been sharply drawn, earlier utilitarians like Bentham and Mill sometimes applied the principle of utility to actions and sometimes applied it to the choice of rules for evaluating actions. This has led to scholarly debates about whether the classical utilitarians supported act utilitarianism or rule utilitarianism or some combination of these views. One indication that Mill accepted rule utilitarianism is his claim that direct appeal to the principle of utility is made only when “secondary principles” (i.e. rules) conflict with one another. In such cases, the “maximize utility” principle is used to resolve the conflict and determine the right action to take (Mill, Utilitarianism, Chapter 2).

3. Act Utilitarianism: Pros and Cons

Act utilitarianism is often seen as the most natural interpretation of the utilitarian ideal. If our aim is always to produce the best results, it seems plausible to think that in each case of deciding what is the right thing to do, we should consider the available options (i.e. what actions could be performed), predict their outcomes, and approve of the action that will produce the most good.

a. Arguments for Act Utilitarianism

i. Why Act Utilitarianism Maximizes Utility

If every action that we carry out yields more utility than any other action available to us, then the total utility of all our actions will be the highest possible level of utility that we could bring about. In other words, we can maximize the overall utility that is within our power to bring about by maximizing the utility of each individual action that we perform. If we sometimes choose actions that produce less utility than is possible, the total utility of our actions will be less than the amount of goodness that we could have produced. For that reason, act utilitarians argue, we should apply the utilitarian principle to individual acts and not to classes of similar actions.

ii. Why Act Utilitarianism is Better than Traditional, Rule-based Moralities

Traditional moral codes often consist of sets of rules regarding types of actions. The Ten Commandments, for example, focus on types of actions, telling us not to kill, steal, bear false witness, commit adultery, or covet the things that belong to others. Although the Biblical sources permit exceptions to these rules (such as killing in self-defense and punishing people for their sins), the form of the commandments is absolute. They tell us “thou shalt not do x” rather than saying “thou shalt not do x except in circumstances a, b, or c.”

In fact, both customary and philosophical moral codes often seem to consist of absolute rules. The philosopher Immanuel Kant is famous for the view that lying is always wrong, even in cases where one might save a life by lying. According to Kant, if A is trying to murder B and A asks you where B is, it would be wrong for you to lie to A, even if lying would save B’s life (Kant).

Act utilitarians reject rigid rule-based moralities that identify whole classes of actions as right or wrong. They argue that it is a mistake to treat whole classes of actions as right or wrong because the effects of actions differ when they are done in different contexts and morality must focus on the likely effects of individual actions. It is these effects that determine whether they are right or wrong in specific cases. Act utilitarians acknowledge that it may be useful to have moral rules that are “rules of thumb”—i.e., rules that describe what is generally right or wrong, but they insist that whenever people can do more good by violating a rule rather than obeying it, they should violate the rule. They see no reason to obey a rule when more well-being can be achieved by violating it.

iii. Why Act Utilitarianism Makes Moral Judgments Objectively True

One advantage of act utilitarianism is that it shows how moral questions can have objectively true answers. Often, people believe that morality is subjective and depends only on people’s desires or sincere beliefs. Act utilitarianism, however, provides a method for showing which moral beliefs are true and which are false.

Once we embrace the act utilitarian perspective, then every decision about how we should act will depend on the actual or foreseeable consequences of the available options. If we can predict the amount of utility/good results that will be produced by various possible actions, then we can know which ones are right or wrong.

Although some people doubt that we can measure amounts of well-being, we in fact do this all the time. If two people are suffering and we have enough medication for only one, we can often tell that one person is experiencing mild discomfort while the other is in severe pain. Based on this judgment, we will be confident that we can do more good by giving the medication to the person suffering extreme pain. Although this case is very simple, it shows that we can have objectively true answers to questions about what actions are morally right or wrong.

Jeremy Bentham provided a model for this type of decision making in his description of a “hedonic calculus,” which was meant to show what factors should be used to determine amounts of pleasure and happiness, pain and suffering. Using this information, Bentham thought, would allow for making correct judgments both in individual cases and in choices about government actions and policies.

b. Arguments against Act Utilitarianism

i. The “Wrong Answers” Objection

The most common argument against act utilitarianism is that it gives the wrong answers to moral questions. Critics say that it permits various actions that everyone knows are morally wrong. The following cases are among the commonly cited examples:

If a judge can prevent riots that will cause many deaths only by convicting an innocent person of a crime and imposing a severe punishment on that person, act utilitarianism implies that the judge should convict and punish the innocent person. (See Rawls and also Punishment.)
If a doctor can save five people from death by killing one healthy person and using that person’s organs for life-saving transplants, then act utilitarianism implies that the doctor should kill the one person to save five.
If a person makes a promise but breaking the promise will allow that person to perform an action that creates just slightly more well-being than keeping the promise will, then act utilitarianism implies that the promise should be broken. (See Ross.)

The general form of each of these arguments is the same. In each case, act utilitarianism implies that a certain act is morally permissible or required. Yet, each of the judgments that flow from act utilitarianism conflicts with widespread, deeply held moral beliefs. Because act utilitarianism approves of actions that most people see as obviously morally wrong, we can know that it is a false moral theory.

ii. The “Undermining Trust” Objection

Although act utilitarians criticize traditional moral rules for being too rigid, critics charge that utilitarians ignore the fact that this alleged rigidity is the basis for trust between people. If, in cases like the ones described above, judges, doctors, and promise-makers are committed to doing whatever maximizes well-being, then no one will be able to trust that judges will act according to the law, that doctors will not use the organs of one patient to benefit others, and that promise-makers will keep their promises. More generally, if everyone believed that morality permitted lying, promise-breaking, cheating, and violating the law whenever doing so led to good results, then no one could trust other people to obey these rules. As a result, in an act utilitarian society, we could not believe what others say, could not rely on them to keep promises, and in general could not count on people to act in accord with important moral rules. People’s behavior would consequently lack the kind of predictability and consistency that are required to sustain trust and social stability.

iii. Partiality and the “Too Demanding” Objection

Critics also attack utilitarianism’s commitment to impartiality and the equal consideration of interests. An implication of this commitment is that whenever people want to buy something for themselves or for a friend or family member, they must first determine whether they could create more well-being by donating their money to help unknown strangers who are seriously ill or impoverished. If more good can be done by helping strangers than by purchasing things for oneself or people one personally cares about, then act utilitarianism requires us to use the money to help strangers in need. Why? Because act utilitarianism requires impartiality and the equal consideration of all people’s needs and interests.

Almost everyone, however, believes that we have special moral duties to people who are near and dear to us. As a result, most people would reject as absurd the notion that morality requires us to treat people we love and care about no differently from perfect strangers.

This issue is not merely a hypothetical case. In a famous article, Peter Singer defends the view that people living in affluent countries should not purchase luxury items for themselves when the world is full of impoverished people. According to Singer, a person should keep donating money to people in dire need until the donor reaches the point where giving to others generates more harm to the donor than the good that is generated for the recipients.

Critics claim that the argument for using our money to help impoverished strangers rather than benefiting ourselves and people we care about proves only one thing: that act utilitarianism is false. Two reasons show why it is false. First, it fails to recognize the moral legitimacy of giving special preferences to ourselves and people that we know and care about. Second, since pretty much everyone is strongly motivated to act on behalf of themselves and people they care about, a morality that forbids this and requires equal consideration of strangers is much too demanding. It asks more than can reasonably be expected of people.

c. Possible Responses to Criticisms of Act Utilitarianism

There are two ways in which act utilitarians can defend their view against these criticisms. First, they can argue that critics misinterpret act utilitarianism and mistakenly claim that it is committed to supporting the wrong answer to various moral questions. This reply agrees that the “wrong answers” are genuinely wrong, but it denies that the “wrong answers” maximize utility. Because they do not maximize utility, these wrong answers would not be supported by act utilitarians and therefore do nothing to weaken their theory.

Second, act utilitarians can take a different approach by agreeing with the critics that act utilitarianism supports the views that critics label “wrong answers.” Act utilitarians may reply that all this shows is that the views supported by act utilitarianism conflict with common sense morality. Unless critics can prove that common sense moral beliefs are correct, the criticisms have no force. Act utilitarians claim that their theory provides good reasons to reject many ordinary moral claims and to replace them with moral views that are based on the effects of actions.

People who are convinced by the criticisms of act utilitarianism may decide to reject utilitarianism entirely and adopt a different type of moral theory. This judgment, however, would be sound only if act utilitarianism were the only type of utilitarian theory. If there are other versions of utilitarianism that do not have act utilitarianism’s flaws, then one may accept the criticisms of act utilitarianism without forsaking utilitarianism entirely. This is what defenders of rule utilitarianism claim. They argue that rule utilitarianism retains the virtues of a utilitarian moral theory but without the flaws of the act utilitarian version.

4. Rule Utilitarianism: Pros and Cons

Unlike act utilitarians, who try to maximize overall utility by applying the utilitarian principle to individual acts, rule utilitarians believe that we can maximize utility only by setting up a moral code that contains rules. The correct moral rules are those whose inclusion in our moral code will produce better results (more well-being) than other possible rules. Once we determine what these rules are, we can then judge individual actions by seeing if they conform to these rules. The principle of utility, then, is used to evaluate rules and is not applied directly to individual actions. Once the rules are determined, compliance with these rules provides the standard for evaluating individual actions.

a. Arguments for Rule Utilitarianism

i. Why Rule Utilitarianism Maximizes Utility

Rule utilitarianism sounds paradoxical. It says that we can produce more beneficial results by following rules than by always performing individual actions whose results are as beneficial as possible. This suggests that we should not always perform individual actions that maximize utility. How could this be something that a utilitarian would support?

In spite of this paradox, rule utilitarianism possesses its own appeal, and its focus on moral rules can sound quite plausible. The rule utilitarian approach to morality can be illustrated by considering the rules of the road. If we are devising a code for drivers, we can adopt either open-ended rules like “drive safely” or specific rules like “stop at red lights,” “do not travel more than 30 miles per hour in residential areas,” “do not drive when drunk,” etc. The rule “drive safely”, like the act utilitarian principle, is a very general rule that leaves it up to individuals to determine what the best way to drive in each circumstance is. More specific rules that require stopping at lights, forbid going faster than 30 miles per hour, or prohibit driving while drunk do not give drivers the discretion to judge what is best to do. They simply tell drivers what to do or not do while driving.

The reason why a more rigid rule-based system leads to greater overall utility is that people are notoriously bad at judging what is the best thing to do when they are driving a car. Having specific rules maximizes utility by limiting drivers’ discretionary judgments and thereby decreasing the ways in which drivers may endanger themselves and others.

A rule utilitarian can illustrate this by considering the difference between stop signs and yield signs. Stop signs forbid drivers to go through an intersection without stopping, even if the driver sees that there are no cars approaching and thus no danger in not stopping. A yield sign permits drivers to go through without stopping unless they judge that approaching cars make it dangerous to drive through the intersection. The key difference between these signs is the amount of discretion that they give to the driver.

The stop sign is like the rule utilitarian approach. It tells drivers to stop and does not allow them to calculate whether it would be better to stop or not. The yield sign is like act utilitarianism. It permits drivers to decide whether there is a need to stop. Act utilitarians see the stop sign as too rigid because it requires drivers to stop even when nothing bad will be prevented. The result, they say, is a loss of utility each time a driver stops at a stop sign when there is no danger from oncoming cars.

Rule utilitarians will reply that they would reject the stop sign method a) if people could be counted on to drive carefully and b) if traffic accidents only caused limited amounts of harm. But, they say, neither of these is true. Because people often drive too fast and are inattentive while driving (because they are, for example, talking, texting, listening to music, or tired), we cannot count on people to make good utilitarian judgments about how to drive safely. In addition, the costs (i.e. the disutility) of accidents can be very high. Accident victims (including drivers) may be killed, injured, or disabled for life. For these reasons, rule utilitarians support the use of stop signs and other non-discretionary rules under some circumstances. Overall these rules generate greater utility because they prevent more disutility (from accidents) than they create (from “unnecessary” stops).
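
The comparison in this reply can be put in rough numerical terms. The sketch below is only an illustration: the function names, the per-stop delay cost, the accident probabilities, and the accident costs are all hypothetical values chosen to show the structure of the argument, not empirical estimates.

```python
# Hypothetical illustration of the stop sign versus yield sign argument.
# Every number here is an assumption made for the example, not traffic data.

DELAY_COST = 1.0  # assumed disutility of one full stop


def expected_utility(p_stop, p_accident, accident_cost):
    """Expected utility of one crossing: expected delay plus expected harm,
    both counted as disutility (so the result is never positive)."""
    return -(p_stop * DELAY_COST + p_accident * accident_cost)


def compare_signs(p_accident_yield, accident_cost, p_accident_stop=0.00001):
    """Return the sign type with the higher expected utility per crossing."""
    # Stop sign: every driver stops, so the delay is always paid.
    stop = expected_utility(1.0, p_accident_stop, accident_cost)
    # Yield sign: drivers stop only when they judge it necessary
    # (assumed here to be half the time).
    yield_ = expected_utility(0.5, p_accident_yield, accident_cost)
    return "stop sign" if stop > yield_ else "yield sign"


# Inattentive drivers and severe accidents: the rigid rule does better.
print(compare_signs(p_accident_yield=0.0001, accident_cost=100_000))
# Condition (a): drivers are as careful at yield signs as at stop signs.
print(compare_signs(p_accident_yield=0.00001, accident_cost=100_000))
# Condition (b): accidents cause only limited harm.
print(compare_signs(p_accident_yield=0.0001, accident_cost=100))
```

With these assumed values, the first comparison favors the stop sign, while the second and third favor the yield sign. This mirrors the rule utilitarian's claim that the rigid rule would be rejected if either condition a) or condition b) held.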

Rule utilitarians generalize from this type of case and claim that our knowledge of human behavior shows that there are many cases in which general rules or practices are more likely to promote good effects than simply telling people to do whatever they think is best in each individual case.

This does not mean that rule utilitarians always support rigid rules without exceptions. Some rules can identify types of situations in which the prohibition is over-ridden. In emergency medical situations, for example, a driver may justifiably go through a red light or stop sign based on the driver’s own assessment that a) this can be done safely and b) the situation is one in which even a short delay might cause dire harms. So the correct rule need not be “never go through a stop sign” but rather can be something like “never go through a stop sign except in cases that have properties a and b.” In addition, there will remain many things about driving or other behavior that can be left to people’s discretion. The rules of the road do not tell drivers when to drive or what their destination should be, for example.

Overall then, rule utilitarians can allow departures from rules and will leave many choices up to individuals. In such cases, people may act in a manner that looks like the approach supported by act utilitarians. Nonetheless, these discretionary actions are permitted because having a rule in these cases does not maximize utility or because the best rule may impose some constraints on how people act while still permitting a lot of discretion in deciding what to do.

ii. Rule Utilitarianism Avoids the Criticisms of Act Utilitarianism

As discussed earlier, critics of act utilitarianism raise three strong objections against it. According to these critics, act utilitarianism a) approves of actions that are clearly wrong, b) undermines trust among people, and c) is too demanding because it requires people to make excessive levels of sacrifice. Rule utilitarians tend to agree with these criticisms of act utilitarianism and try to explain why rule utilitarianism is not open to any of these objections.

1. Judges, Doctors, and Promise-makers

Critics of act utilitarianism claim that it allows judges to sentence innocent people to severe punishments when doing so will maximize utility, allows doctors to kill healthy patients if by doing so, they can use the organs of one person to save more lives, and allows people to break promises if that will create slightly more benefits than keeping the promise.

Rule utilitarians say that they can avoid all these charges because they do not evaluate individual actions separately but instead support rules whose acceptance maximizes utility. To see the difference that their focus on rules makes, consider which rule would maximize utility: a) a rule that allows medical doctors to kill healthy patients so that they can use their organs for transplants that will save a larger number of patients who would die without these organs; or b) a rule that forbids doctors to remove the organs of healthy patients in order to benefit other patients.

Although more good may be done by killing the healthy patient in an individual case, it is unlikely that more overall good will be done by having a rule that allows this practice. If a rule were adopted that allows doctors to kill healthy patients when this will save more lives, the result would be that many people would not go to doctors at all. A rule utilitarian evaluation will take account of the fact that the benefits of medical treatment would be greatly diminished because people would no longer trust doctors. People who seek medical treatment must have a high degree of trust in doctors. If they had to worry that doctors might use their organs to help other patients, they would not, for example, allow doctors to anesthetize them for surgery because the resulting loss of consciousness would make them completely vulnerable and unable to defend themselves. Thus, the rule that allows doctors to kill one patient to save five would not maximize utility.

The same reasoning applies equally to the case of the judge. In order to have a criminal justice system that protects people from being harmed by others, we authorize judges and other officials to impose serious punishments on people who are convicted of crimes. The purpose of this is to provide overall security to people in their jurisdiction, but this requires that criminal justice officials only have the authority to impose arrest and imprisonment on people who are actually believed to be guilty. They do not have the authority to do whatever they think will lead to the best results in particular cases. Whatever they do must be constrained by rules that limit their power. Act utilitarians may sometimes support the intentional punishment of innocent people, but rule utilitarians will understand the risks involved and will oppose a practice that allows it.

Rule utilitarians offer a similar analysis of the promise keeping case. They explain that in general, we want people to keep their promises even in some cases in which doing so may lead to less utility than breaking the promise. The reason for this is that the practice of promise-keeping is very valuable. It enables people to have a wide range of cooperative relationships by generating confidence that other people will do what they promise to do. If we knew that people would fail to keep promises whenever some option arises that leads to more utility, then we could not trust people who make promises to us to carry them through. We would always have to worry that some better option (one that act utilitarians would favor) might emerge, leading to the breaking of the person’s promise to us.

In each of these cases then, rule utilitarians can agree with the critics of act utilitarianism that it is wrong for doctors, judges, and promise-makers to do case by case evaluations of whether they should harm their patients, convict and punish innocent people, and break promises. The rule utilitarian approach stresses the value of general rules and practices, and shows why compliance with rules often maximizes overall utility even if in some individual cases, it requires doing what produces less utility.

2. Maintaining vs. Undermining Trust

Rule utilitarians see the social impact of a rule-based morality as one of the key virtues of their theory. The three cases just discussed show why act utilitarianism undermines trust but rule utilitarianism does not. Fundamentally, in the cases of doctors, judges, and promise-keepers, it is trust that is at stake. Being able to trust other people is extremely important to our well-being. Part of trusting people involves being able to predict what they will and won’t do. Because act utilitarians are committed to a case by case evaluation method, the adoption of their view would make people’s actions much less predictable. As a result, people would be less likely to see other people as reliable and trustworthy. Rule utilitarianism does not have this problem because it is committed to rules, and these rules generate positive “expectation effects” that give us a basis for knowing how other people are likely to behave.

While rule utilitarians do not deny that there are people who are not trustworthy, they can claim that their moral code generally condemns violations of trust as wrongful acts. The problem with act utilitarians is that they support a moral view that has the effect of undermining trust and that sacrifices the good effects of a moral code that supports and encourages trustworthiness.

3. Impartiality and the Problem of Over-Demandingness

Rule utilitarians believe that their view is also immune to the criticism that act utilitarianism is too demanding. In addition, while the act utilitarian commitment to impartiality undermines the moral relevance of personal relations, rule utilitarians claim that their view is not open to this criticism. They claim that rule utilitarianism allows for partiality toward ourselves and others with whom we share personal relationships. Moreover, they say, rule utilitarianism can recognize justifiable partiality to some people without rejecting the commitment to impartiality that is central to the utilitarian tradition.

How can rule utilitarianism do this? How can it be an impartial moral theory while also allowing partiality in people’s treatment of their friends, family, and others with whom they have a special connection?

In his defense of rule utilitarianism, Brad Hooker distinguishes two different contexts in which partiality and impartiality play a role. One involves the justification of moral rules and the other concerns the application of moral rules. Justifications of moral rules, he claims, must be strictly impartial. When we ask whether a rule should be adopted, it is essential to consider the impact of the rule on all people and to weigh the interests of everyone equally.

The second context concerns the content of the rules and how they are applied in actual cases. Rule utilitarians argue that a rule utilitarian moral code will allow partiality to play a role in determining what morality requires, forbids, or allows us to do. As an example, consider the moral rule that parents have a special duty to care for their own children. (See Parental Rights and Obligations.) This is a partialist rule because it not only allows but actually requires parents to devote more time, energy, and other resources to their own children than to others. While it does not forbid devoting resources to other people’s children, it allows people to give priority to their own. While the content of this rule is not impartial, rule utilitarians believe it can be impartially justified. Partiality toward children can be justified for several reasons. Caring for children is a demanding activity. Children need the special attention of adults to develop physically, emotionally, and cognitively. Because children’s needs vary, knowledge of particular children’s needs is necessary to benefit them. For these reasons, it is plausible to believe that children’s well-being can best be promoted by a division of labor that requires particular parents (or other caretakers) to focus primarily on caring for specific children rather than trying to take care of all children. It is not possible for absentee parents or strangers to provide individual children with all that they need. Therefore, we can maximize the overall well-being of children as a class by designating certain people as the caretakers for specific children. For these reasons, partiality toward specific children can be impartially justified.

Similar “division of labor” arguments can be used to provide impartial justifications of other partialist rules and practices. Teachers, for example, have special duties to students in their own classes and have no duty to educate all students. Similarly, public officials can and should be partial to people in the jurisdiction in which they work. If the overall aim is to maximize the well-being of all people in all cities, for example, then we are likely to get better results by having individuals who know and understand particular cities focus on them while other people focus on other cities.

Based on examples like these, rule utilitarians claim that their view, unlike act utilitarianism, avoids the problems raised about demandingness and partiality. Being committed to impartialist justifications of moral rules does not commit them to rejecting moral rules that allow or require people to give specific others priority.

While rule utilitarians can defend partiality, their commitment to maximizing overall utility also allows them to justify limits on the degree of partiality that is morally permissible. At a minimum, rule utilitarians will support a rule that forbids parents to harm other people’s children in order to advance the interests of their own children. (It would be wrong, for example, for a parent to injure children who are running in a school race in order to increase the chances that their own children will win.) Moreover, though this is more controversial, rule utilitarians may support a rule that says that if parents are financially well-off and if their own children’s needs are fully met, these parents may have a moral duty to contribute some resources for children who are deprived of essential resources.

The key point is that while rule utilitarianism permits partiality toward some people, it can also generate rules that limit the ways in which people may act partially. It might even support a positive duty for well-off people to provide assistance to strangers when the needs and interests of the people to whom they are partial are fully met, when they have surplus resources that could be used to assist strangers in dire conditions, and when there are ways to channel these resources effectively to people in dire need.

b. Arguments against Rule Utilitarianism

i. The “Rule Worship” Objection

Act utilitarians criticize rule utilitarians for irrationally supporting rule-based actions in cases where more good could be done by violating the rule than obeying it. They see this as a form of “rule worship,” an irrational deference to rules that has no utilitarian justification (J. J. C. Smart).

Act utilitarians say that they recognize that rules can have value. For example, rules can provide a basis for acting when there is no time to deliberate. In addition, rules can define a default position, a justification for doing (or refraining from) a type of action as long as there is no reason for not doing it. But when people know that more good can be done by violating the rule, then the default position should be over-ridden.

ii. The “Collapses into Act Utilitarianism” Objection

While the “rule worship” objection assumes that rule utilitarianism is different from act utilitarianism, some critics deny that this is the case. In their view, whatever defects act utilitarianism may have, rule utilitarianism will have the same defects. According to this criticism, although rule utilitarianism looks different from act utilitarianism, a careful examination shows that it collapses into or, as David Lyons claimed, is extensionally equivalent to act utilitarianism.

To understand this criticism, it is worth focusing on a distinction between rule utilitarianism and non-utilitarian theories. Consider Kant’s claim that lying is always morally wrong, even when lying would save a person’s life. Many people see this view as too rigid and claim that it fails to take into account the circumstances in which a lie is being told. A more plausible rule would say “do not lie except in special circumstances that justify lying.” But what are these special circumstances? For a utilitarian, it is natural to say that the correct rule is “do not lie except when lying will generate more good than telling the truth.”

Suppose that a rule utilitarian adopts this approach and advocates a moral code that consists of a list of rules of this form. The rules would say something like “do x except when not doing x maximizes utility” and “do not do x except when doing x maximizes utility.” While this may sound plausible, it is easy to see that this version of rule utilitarianism is in fact identical with act utilitarianism. Whatever action x is, the moral requirement and the moral prohibition expressed in these rules collapse into the act utilitarian rules “do x only when doing x maximizes utility” or “do not do x except when doing x maximizes utility.” These rules say exactly the same thing as the open-ended act utilitarian rule “Do whatever action maximizes utility.”

If rule utilitarianism is to be distinct from act utilitarianism, its supporters must find a way to formulate rules that allow exceptions to a general requirement or prohibition while not collapsing into act utilitarianism. One way to do this is to identify specific conditions under which violating a general moral requirement would be justified. Instead of saying that we can violate a general rule whenever doing so will maximize utility, the rule utilitarian code might say things like “Do not lie except to prevent severe harms to people who are not unjustifiably threatening others with severe harm.” This type of rule would prohibit lying generally, but it would permit lying to a murderer to prevent harm to the intended victims even if the lie would lead to harm to the murderer. In cases of lesser harms or deceitful acts that will benefit the liar, lying would still be prohibited, even if lying might maximize overall utility.
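
The contrast between the two kinds of exception clause can be checked mechanically. The following sketch is a toy model, not something drawn from Lyons or from the rule utilitarian literature: the function names, the small range of utility values, and the Boolean flag prevents_severe_harm are hypothetical, the flag is treated as independent of the utilities, and ties in utility are left out for simplicity.

```python
from itertools import product


def act_utilitarian(u_lie, u_truth):
    """Choose whichever option yields more utility."""
    return "lie" if u_lie > u_truth else "truth"


def utility_exception_rule(u_lie, u_truth):
    """'Do not lie except when lying will generate more good.'"""
    verdict = "truth"        # the general prohibition
    if u_lie > u_truth:      # the open-ended, utility-based exception
        verdict = "lie"
    return verdict


def specific_exception_rule(prevents_severe_harm):
    """'Do not lie except to prevent severe harms' (simplified)."""
    return "lie" if prevents_severe_harm else "truth"


# Enumerate toy situations: small integer utilities, ties excluded.
cases = [
    (u_lie, u_truth, harm)
    for u_lie, u_truth, harm in product(range(-2, 3), range(-2, 3), (False, True))
    if u_lie != u_truth
]

# Situations in which each rule gives a verdict different from act utilitarianism.
collapse_gaps = [c for c in cases
                 if utility_exception_rule(c[0], c[1]) != act_utilitarian(c[0], c[1])]
specific_gaps = [c for c in cases
                 if specific_exception_rule(c[2]) != act_utilitarian(c[0], c[1])]

print(len(collapse_gaps))      # no case separates this rule from act utilitarianism
print(len(specific_gaps) > 0)  # the rule with a specific exception sometimes diverges
```

In this toy model the rule with the utility-based exception never disagrees with act utilitarianism, which is the sense in which it is extensionally equivalent. The rule with a specific exception still forbids some lies that would slightly increase utility, so it remains a distinct view.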

Rule utilitarians claim that this sort of rule is not open to the “collapses into act utilitarianism” objection. It also suggests, however, that rule utilitarians face difficult challenges in formulating utility-based rules that have a reasonable degree of flexibility built into them but are not so flexible that they collapse into act utilitarianism. In addition, although the rules that make up a moral code should be flexible enough to account for the complexities of life, they cannot be so complex that they are too difficult for people to learn and understand.

iii. Wrong Answers and Crude Concepts

Although rule utilitarians try to avoid the weaknesses attributed to act utilitarianism, critics argue that they cannot avoid these weaknesses because they do not take seriously many of our central moral concepts. As a result, they cannot support the right answers to crucial moral problems. Three prominent concepts in moral thought that critics cite are justice, rights, and desert. These moral ideas are often invoked in reasoning about morality, but critics claim that neither rule nor act utilitarianism acknowledges their importance. Instead, both focus only on the amounts of utility that actions or rules generate.

In considering the case, for example, of punishing innocent people, the best that rule utilitarians can do is to say that a rule that permits this would lead to worse results overall than a rule that prohibited it. This prediction, however, is precarious. While it may be true, it may also be false, and if it is false, then utilitarians must acknowledge that intentionally punishing an innocent person could sometimes be morally justified.

Against this, critics may appeal to common sense morality to support the view that there are no circumstances in which punishing the innocent can be justified because the innocent person a) is being treated unjustly, b) has a right not to be punished for something that he or she is not guilty of, and c) does not deserve to be punished for a crime that he or she did not commit.

In responding, rule utilitarians may begin, first, with the view that they do not reject concepts like justice, rights, and desert. Instead, they accept and use these concepts but interpret them from the perspective of maximizing utility. To speak of justice, rights, and desert is to speak of rules of individual treatment that are very important, and what makes them important is their contribution to promoting overall well-being. Moreover, even people who accept these concepts as basic still need to determine whether it is always wrong to treat someone unjustly, violate their rights, or treat them in ways that they don’t deserve.

Critics object to utilitarianism by claiming that the theory justifies treating people unjustly, violating their rights, etc. This criticism only stands up if it is always wrong and thus never morally justified to treat people in these ways. Utilitarians argue that moral common sense is less absolutist than their critics acknowledge. In the case of punishment, for example, while we hope that our system of criminal justice gives people fair trials and conscientiously attempts to separate the innocent from the guilty, we know that the system is not perfect. As a result, people who are innocent are sometimes prosecuted, convicted, and punished for crimes they did not commit.

This is the problem of wrongful convictions, which poses a difficult challenge to critics of utilitarianism. If we know that our system of criminal justice punishes some people unjustly and in ways they don’t deserve, we are faced with a dilemma. Either we can shut down the system and punish no one, or we can maintain the system even though we know that it will result in some innocent people being unjustly punished in ways that they do not deserve. Most people will support continuing to punish people in spite of the fact that it involves punishing some people unjustly. According to rule utilitarians, this can only be justified if a rule that permits punishments (after a fair trial, etc.) yields more overall utility than a rule that rejects punishment because it treats some people unfairly. To end the practice of punishment entirely—because it inevitably causes some injustice—is likely to result in worse consequences because it deprives society of a central means of protecting people’s well-being, including what are regarded as their rights. In the end, utilitarians say, it is justice and rights that give way when rules that approve of violations in some cases yield the greatest amount of utility.

5. Conclusion

The debate between act utilitarianism and rule utilitarianism highlights many important issues about how we should make moral judgments. Act utilitarianism stresses the specific context and the many individual features of the situations that pose moral problems, and it presents a single method for dealing with these individual cases. Rule utilitarianism stresses the recurrent features of human life and the ways in which similar needs and problems arise over and over again. From this perspective, we need rules that deal with types or classes of actions: killing, stealing, lying, cheating, taking care of our friends or family, punishing people for crimes, aiding people in need, etc. Both of these perspectives, however, agree that the main determinant of what is right or wrong is the impact that our actions, or the moral code we adopt, have on the level of people’s well-being.

6. References and Further Reading

a. Classic Works

Jeremy Bentham. An Introduction to the Principles of Morals and Legislation, available in many editions, 1789.
- See Book I, chapter 1 for Bentham’s statement of what utilitarianism is; chapter IV for his method of measuring amounts of pleasure/utility; chapter V for his list of types of pleasures and pains, and chapter XIII for his application of utilitarianism to questions about criminal punishment.
John Stuart Mill. Utilitarianism, available in many editions and online, 1861.
- See especially chapter II, in which Mill tries both to clarify and defend utilitarianism. Passages at the end of the chapter suggest that Mill was a rule utilitarian. In chapter V, Mill tries to show that utilitarianism is compatible with justice.
Henry Sidgwick. The Methods of Ethics, Seventh Edition, available in many editions, 1907.
- Sidgwick is known for his careful, extended analysis of utilitarian moral theory and competing views.
G. E. Moore. Principia Ethica, 1903.
- Moore criticizes aspects of Mill’s views but supports a non-hedonistic form of utilitarianism.
G. E. Moore. Ethics. Oxford: Oxford University Press, 1912.
- Mostly focused on utilitarianism, this book contains a combination of act and rule utilitarian ideas.

b. More Recent Utilitarians

J. J. C. Smart. “An Outline of a System of Utilitarian Ethics” in J. J. C. Smart and Bernard Williams, Utilitarianism: For and Against. Cambridge University Press, 1973.
- Smart’s discussion combines an overview of moral theory and a defense of act utilitarianism. It is followed by Bernard Williams’s “A Critique of Utilitarianism,” a source of many important criticisms of utilitarianism.
Richard Brandt. Ethical Theory. Prentice Hall, 1959. Chapter 15.
- Brandt, who coined the terms “act” and “rule” utilitarianism, explains and criticizes act utilitarianism and tentatively proposes a version of rule utilitarianism.
Richard Brandt. Morality, Utilitarianism, and Rights. Cambridge University Press, 1992.
- Brandt developed and defended rule utilitarianism in many papers. This book contains several of them as well as works in which he applies rule utilitarian thinking to issues like rights and the ethics of war.
R. M. Hare. Moral Thinking. Oxford University Press, 1981.
- An interesting development of a form of rule utilitarianism by an influential moral theorist.
John C. Harsanyi. “Morality and the Theory of Rational Behavior.” in Social Research 44.4 (1977): 623-656. (Reprinted in Amartya Sen and Bernard Williams, eds., Utilitarianism and Beyond, Cambridge University Press, 1982).
- Harsanyi, a Nobel Prize economist, defends rule utilitarianism, connecting it to a preference theory of value and a theory of rational action.
John Rawls. “Two Concepts of Rules.” In Philosophical Review LXIV (1955), 3-32.
- Before becoming an influential critic of utilitarianism, Rawls wrote this defense of rule utilitarianism.
Brad Hooker. Ideal Code, Real World: A Rule-consequentialist Theory of Morality. Oxford University Press, 2000.
- In this 21st-century defense of rule utilitarianism, Hooker places it in the context of more recent developments in philosophy.
Peter Singer. Writings on an Ethical Life. HarperCollins, 2000.
- Singer, a prolific, widely read thinker, mostly applies a utilitarian perspective to controversial moral issues (for example, euthanasia, the treatment of non-human animals, and global poverty) rather than discussing utilitarian moral theory. This volume contains selections from his books and articles.
Peter Singer. “Famine, Affluence, and Morality” in Philosophy and Public Affairs 1 (1972), 229-43. Reprinted in Peter Singer. Writings on an Ethical Life. Harper Collins, 2000.
- This widely reprinted article, though it does not focus on utilitarianism, uses utilitarian reasoning and has sparked decades of debate about moral demandingness and moral impartiality.
Robert Goodin. Utilitarianism as a Public Philosophy. Cambridge University Press, 1995.
- In a series of essays, Goodin argues that utilitarianism is the best philosophy for public decision-making even if it fails as an ethic for personal aspects of life.
Derek Parfit. On What Matters. Oxford University Press, 2011.
- In a long, complex work, Parfit stresses the importance of Henry Sidgwick as a moral philosopher and argues that rule utilitarianism and Kantian deontology can be understood in a way that makes them compatible with one another.

c. Overviews

Tim Mulgan. Understanding Utilitarianism. Acumen, 2007.
- This is a very clear description of utilitarianism, including explanations of arguments both for and against. Chapter 2 discusses Bentham, Mill, and Sidgwick while chapter 6 focuses on act and rule utilitarianism.
Julia Driver, “The History of Utilitarianism,” Stanford Encyclopedia of Philosophy.
- This article gives a good historical account of important figures in the development of utilitarianism.
Walter Sinnott-Armstrong, “Consequentialism,” Stanford Encyclopedia of Philosophy.
- This very useful overview is relevant to utilitarianism and other forms of consequentialism.
William Shaw. Contemporary Ethics: Taking Account of Utilitarianism. Blackwell, 1999.
- Shaw provides a clear, comprehensive discussion of utilitarianism and its critics as well as defending utilitarianism.
John Troyer. The Classical Utilitarians: Bentham and Mill. Hackett, 2003.
- Troyer’s introduction to this book of selections from Mill and Bentham is clear and informative.
Ben Eggleston and Dale Miller, eds. The Cambridge Companion to Utilitarianism. Cambridge University Press, 2014.
- This collection contains sixteen essays on utilitarianism, including essays on historical figures as well as discussion of 21st-century issues, including both act and rule utilitarianism.

d. J. S. Mill and Utilitarian Moral Theory

J. O. Urmson. “The Interpretation of the Moral Philosophy of J. S. Mill,” in Philosophical Quarterly (1953) 3, 33-9.
- This article generated renewed interest in both Mill’s moral theory and rule utilitarianism.
Roger Crisp. Routledge Philosophy Guidebook to Mill on Utilitarianism. Routledge, 1997.
- A clear discussion of Mill’s Utilitarianism with chapters on key topics as well as on Mill’s On Liberty and The Subjection of Women.
Henry R. West, ed. The Blackwell Guide to Mill’s Utilitarianism. Blackwell, 2006.
- This contains the complete text of Mill’s Utilitarianism preceded by three essays on the background to Mill’s utilitarianism and followed by five interpretative essays and four focusing on contemporary issues.
Henry R. West. An Introduction to Mill’s Utilitarian Ethics. Cambridge University Press, 2004.
- A clear discussion of Mill; Chapter 4 argues that Mill is neither an act nor a rule utilitarian. Chapter 6 focuses on utilitarianism and justice.
Dale Miller. J. S. Mill. Polity Press, 2010.
- Miller, in Chapter 6, argues that Mill was a rule utilitarian.
Stephen Nathanson. “John Stuart Mill on Economic Justice and the Alleviation of Poverty,” in Journal of Social Philosophy, XLIII, no. 2.
- Drawing on Mill’s Principles of Political Economy, Nathanson claims that Mill was a rule utilitarian and provides an interpretation of Mill’s views on economic justice.
Wendy Donner, “Mill’s Utilitarianism” in John Skorupski, ed. The Cambridge Companion to Mill. Cambridge University Press, 1998, 255–92.
- A discussion of Mill’s views and some recent interpretations of them.
David Lyons. Rights, Welfare, and Mill’s Moral Theory. Oxford, 1994.
- In this series of papers, Lyons defends Mill’s view of morality against some critics, differentiates Mill’s views from both act and rule utilitarianism, and criticizes Mill’s attempt to show that utilitarianism can account for justice.

e. Critics of Utilitarianism

David Lyons. Forms and Limits of Utilitarianism. Oxford, 1965.
- Lyons argues that at least some versions of rule utilitarianism collapse into act utilitarianism.
David Lyons. “The Moral Opacity of Utilitarianism” in Brad Hooker, Elinor Mason, and Dale Miller, eds. Morality, Rules, and Consequences. Rowman and Littlefield, 2000.
- In a challenging essay, Lyons raises doubts about whether there is any coherent version of utilitarianism.
Judith Jarvis Thomson. “The Trolley Problem.” Yale Law Journal 94 (1985), 1395-1415. Reprinted in Judith Jarvis Thomson. Rights, Restitution and Risk. Edited by William Parent. Harvard University Press, 1986; Chapter 7.
- An influential rights-based discussion in which Jarvis Thomson uses hypothetical cases to show, among other things, that utilitarianism cannot explain why some actions that cause killings are permissible and others not.
Bernard Williams, “A Critique of Utilitarianism,” in J. J. C. Smart and Bernard Williams, Utilitarianism: For and Against. Cambridge University Press, 1973.
- Williams’ contribution to this debate contains arguments and examples that have played an important role in debates about utilitarianism and moral theory.

f. Collections of Essays

Michael D. Bayles, ed. Contemporary Utilitarianism. Garden City: Doubleday, 1968.
- Ten essays that debate act vs. rule utilitarianism as well as whether a form of utilitarianism is correct.
Samuel Gorovitz, ed. John Stuart Mill: Utilitarianism, With Critical Essays. Indianapolis: The Bobbs-Merrill Company, 1971.
- This includes Mill’s Utilitarianism plus a rich array of twenty-eight (pre-1970) articles interpreting, defending, and criticizing utilitarianism.
Brad Hooker, Elinor Mason, and Dale Miller, eds. Morality, Rules, and Consequences. Rowman and Littlefield, 2000.
- Thirteen essays on utilitarianism, many focused on issues concerning rule utilitarianism.
Samuel Scheffler. Consequentialism and Its Critics. Oxford, 1988.
- This contains a dozen influential articles, mostly by prominent critics of utilitarianism and other forms of consequentialism.
Amartya Sen and Bernard Williams, eds. Utilitarianism and Beyond. Cambridge: Cambridge University Press, 1982.
- This contains fourteen articles, including essays defending utilitarianism by R. M. Hare and John Harsanyi. As the title suggests, however, most of the articles are critical of utilitarianism.

Author Information

Stephen Nathanson
Email: s.nathanson@neu.edu
Northeastern University
U. S. A.

Immanuel Kant

At the foundation of Kant’s system is the doctrine of “transcendental idealism,” which emphasizes a distinction between what we can experience (the natural, observable world) and what we cannot (“supersensible” objects such as God and the soul). Kant argued that we can only have knowledge of things we can experience. Accordingly, in answer to the question, “What can I know?” Kant replies that we can know the natural, observable world, but we cannot have answers to many of the deepest questions of metaphysics.

Kant’s ethics are organized around the notion of a “categorical imperative,” which is a universal ethical principle stating that one should always respect the humanity in others, and that one should only act in accordance with rules that could hold for everyone. Kant argued that the moral law is a truth of reason, and hence that all rational creatures are bound by the same moral law. Thus in answer to the question, “What should I do?” Kant replies that we should act rationally, in accordance with a universal moral law.

Kant also argued that his ethical theory requires belief in free will, God, and the immortality of the soul. Although we cannot have knowledge of these things, reflection on the moral law leads to a justified belief in them, which amounts to a kind of rational faith. Thus in answer to the question, “What may I hope?” Kant replies that we may hope that our souls are immortal and that there really is a God who designed the world in accordance with principles of justice.

In addition to these three focal points, Kant also made lasting contributions to nearly all areas of philosophy. His aesthetic theory remains influential among art critics. His theory of knowledge is required reading for many branches of analytic philosophy. The cosmopolitanism behind his political theory colors discourse about globalization and international relations. And some of his scientific contributions are even considered intellectual precursors to several ideas in contemporary cosmology.

This article presents an overview of these and other of Kant’s most important philosophical contributions. It follows standard procedures for citing Kant’s works. Passages from Critique of Pure Reason are cited by reference to page numbers in both the 1781 and 1787 editions. Thus “(A805/B833)” refers to page 805 in the 1781 edition and 833 in the 1787 edition. References to the rest of Kant’s works refer to the volume and page number of the official Deutsche Akademie editions of Kant’s works. Thus “(5:162)” refers to volume 5, page 162 of those editions.

Life
Metaphysics and Epistemology
Philosophy of Mathematics
Natural Science
1. Physics
2. Other Scientific Contributions
Moral Theory
Political Theory and Theory of Human History
Theory of Art and Beauty
Pragmatic Anthropology
References and Further Reading
1. Primary Literature
2. Secondary Literature

1. Life

Kant was born in 1724 in the Prussian city of Königsberg (now Kaliningrad in Russia). His parents – Johann Georg and Anna Regina – were pietists. Although they raised Kant in this tradition (an austere offshoot of Lutheranism that emphasized humility and divine grace), he does not appear ever to have been very sympathetic to this kind of religious devotion. As a youth, he attended the Collegium Fridericianum in Königsberg, after which he attended the University of Königsberg. Although he initially focused his studies on the classics, philosophy soon caught and held his attention. The rationalism of Gottfried Leibniz (1646-1716) and Christian Wolff (1679-1754) was most influential on him during these early years, but Kant was also introduced to Isaac Newton’s (1642-1727) writings during this time.

His mother had died in 1737, and after his father’s death in 1746 Kant left the University to work as a private tutor for several families in the countryside around the city. He returned to the University in 1754 to teach as a Privatdozent, which meant that he was paid directly by individual students, rather than by the University. He supported himself in this way until 1770. Kant published many essays and other short works during this period. He made minor scientific contributions in astronomy, physics, and earth science, and wrote philosophical treatises engaging with the Leibnizian-Wolffian traditions of the day (many of these are discussed below). Kant’s primary professional goal during this period was to eventually attain the position of Professor of Logic and Metaphysics at Königsberg. He finally succeeded in 1770 (at the age of 46) when he completed his second dissertation (the first had been published in 1755), which is now referred to as the Inaugural Dissertation.

Commentators divide Kant’s career into the “pre-critical” period before 1770 and the “critical” period after. After the publication of the Inaugural Dissertation, Kant published hardly anything for more than a decade (this period is referred to as his “silent decade”). However, this was anything but a fallow period for Kant. After discovering and being shaken by the radical skepticism of Hume’s empiricism in the early 1770s, Kant undertook a massive project to respond to Hume. He realized that this response would require a complete reorientation of the most fundamental approaches to metaphysics and epistemology. Although it took much longer than initially planned, his project came to fruition in 1781 with the publication of the first edition of Critique of Pure Reason.

The 1780s would be the most productive years of Kant’s career. In addition to writing the Prolegomena to Any Future Metaphysics (1783) as a sort of introduction to the Critique, Kant wrote important works in ethics (Groundwork for the Metaphysics of Morals, 1785, and Critique of Practical Reason, 1788), he applied his theoretical philosophy to Newtonian physical theory (Metaphysical Foundations of Natural Science, 1786), and he substantially revised the Critique of Pure Reason in 1787. Kant capped the decade with the publication of the third and final critique, Critique of the Power of Judgment (1790).

Although the products of the 1780s are the works for which Kant is best known, he continued to publish philosophical writings through the 1790s as well. Of note during this period are Religion within the Bounds of Mere Reason (1793), Towards Perpetual Peace (1795), Metaphysics of Morals (1797), and Anthropology from a Pragmatic Point of View (1798). The Religion was attended with some controversy, and Kant was ultimately led to promise the King of Prussia (Friedrich Wilhelm II) not to publish anything else on religion. (Kant considered the promise null and void after the king died in 1797.) During his final years, he devoted himself to completing the critical project with one final bridge to physical science. Unfortunately, the encroaching dementia of Kant’s final years prevented him from completing this book (partial drafts are published under the title Opus Postumum).

Kant never married and there are many stories that paint him as a quirky but dour eccentric. These stories do not do him justice. He was beloved by his friends and colleagues. He was consistently generous to all those around him, including his servants. He was universally considered a lively and engaging dinner guest and (later in life) host. And he was a devoted and popular teacher throughout the five decades he spent in the classroom. Although he had hoped for a small, private ceremony, when he died in 1804, age 79, his funeral was attended by the thousands who wished to pay their respects to “the sage of Königsberg.”

2. Metaphysics and Epistemology

The most important element of Kant’s mature metaphysics and epistemology is his doctrine of transcendental idealism, which received its fullest discussion in Critique of Pure Reason (1781/87). Transcendental idealism is the thesis that the empirical world that we experience (the “phenomenal” world of “appearances”) is to be distinguished from the world of things as they are in themselves. The most significant aspect of this distinction is that while the empirical world exists in space and time, things in themselves are neither spatial nor temporal. Transcendental idealism has wide-ranging consequences. On the positive side, Kant takes transcendental idealism to entail an “empirical realism,” according to which humans have direct epistemic access to the natural, physical world and can even have a priori cognition of basic features of all possible experienceable objects. On the negative side, Kant argues that we cannot have knowledge of things in themselves. Further, since traditional metaphysics deals with things in themselves, the questions of traditional metaphysics (for example, regarding God or free will) can never be answered by human minds.

This section addresses the development of Kant’s metaphysics and epistemology and then summarizes the most important arguments and conclusions of Kant’s theory.

a. Pre-Critical Thought

Critique of Pure Reason, the book that would alter the course of western philosophy, was written by a man already far into his career. Unlike that of the later “critical period” Kant, the philosophical output of the early Kant was fully enmeshed in the German rationalist tradition, which was dominated at the time by the writings of Gottfried Leibniz (1646-1716) and Christian Wolff (1679-1754). Nevertheless, many of Kant’s concerns during the pre-critical period anticipate important aspects of his mature thought.

Kant’s first purely philosophical work was the New Elucidation of the First Principles of Metaphysical Cognition (1755). The first parts of this long essay present criticisms and revisions of the Wolffian understanding of the basic principles of metaphysics, especially the Principles of Identity (whatever is, is, and whatever is not, is not), of Contradiction (nothing can both be and not be), and of Sufficient Reason (nothing is true without a reason why it is true). In the final part, Kant defends two original principles of metaphysics. According to the “Principle of Succession,” all change in objects requires the mutual interaction of a plurality of substances. This principle is a metaphysical analogue of Newton’s principle of action and reaction, and it anticipates Kant’s argument in the Third Analogy of Experience from Critique of Pure Reason (see 2f below). According to the “Principle of Coexistence,” multiple substances can only be said to coexist within the same world if the unity of that world is grounded in the intellect of God. Although Kant would later claim that we can never have metaphysical cognition of this sort of relation between God and the world (not least of all because we can’t even know that God exists), he would nonetheless continue to be occupied with the question of how multiple distinct substances can constitute a single, unified world.

In the Physical Monadology (1756), Kant attempts to provide a metaphysical account of the basic constitution of material substance in terms of “monads.” Leibniz and Wolff had held that monads are the simple, atomic substances that constitute matter. Kant follows Wolff in rejecting Leibniz’s claim that monads are mindlike and that they do not interact with each other. The novel aspect of Kant’s account lies in his claim that each monad possesses a degree of both attractive and repulsive force, and that monads fill determinate volumes of space because of the interactions between these monads as they compress each other through their opposed repulsive forces. Thirty years later, in the Metaphysical Foundations of Natural Science (1786), Kant would develop the theory that matter must be understood in terms of interacting attractive and repulsive forces. The primary difference between the later view and the earlier is that Kant no longer appeals to monads, or simple substances at all (transcendental idealism rules out the possibility of simplest substances as constituents of matter; see 2gii below).

The final publication of Kant’s pre-critical period was On the Form and Principles of the Sensible and the Intelligible World, also referred to as the Inaugural Dissertation (1770), since it marked Kant’s appointment as Königsberg’s Professor of Logic and Metaphysics. Although Kant had not yet had the final crucial insights that would lead to the development of transcendental idealism, many of the important elements of his mature metaphysics are prefigured here. Two aspects of the Inaugural Dissertation are especially worth noting. First, in a break from his predecessors, Kant distinguishes two fundamental faculties of the mind: sensibility, which represents the world through singular “intuitions,” and understanding, which represents the world through general “concepts.” In the Inaugural Dissertation, Kant argues that sensibility represents the sensible world of “phenomena” while the understanding represents an intelligible world of “noumena.” The critical period Kant will deny that we can have any determinate knowledge of noumena, and will argue that knowledge of phenomena requires the cooperation of sensibility and understanding together. Second, in describing the “form” of the sensible world, Kant argues that space and time are “not something objective and real,” but are rather “subjective and ideal” (2:403). The claim that space and time pertain to things only as they appear, not as they are in themselves, will be one of the central theses of Kant’s mature transcendental idealism.

b. Dogmatic Slumber, Synthetic A Priori Knowledge, and the Copernican Shift

Although the early Kant showed a complete willingness to dissent from many important aspects of the Wolffian orthodoxy of the time, Kant continued to take for granted the basic rationalist assumption that metaphysical cognition was possible. In a retrospective remark from the Prolegomena to Any Future Metaphysics (1783), Kant says that his faith in this rationalist assumption was shaken by David Hume (1711-1776), whose skepticism regarding the possibility of knowledge of causal necessary connections awoke Kant from his “dogmatic slumber” (4:260). Hume argued that we can never have knowledge of necessary connections between causes and effects because such knowledge can neither be given through the senses, nor derived a priori as conceptual truths. Kant realized that Hume’s problem was a serious one because his skepticism about knowledge of the necessity of the connection between cause and effect generalized to all metaphysical knowledge pertaining to necessity, not just causation specifically. For instance, there is the question why mathematical truths necessarily hold true in the natural world, or the question whether we can know that a being (God) exists necessarily.

The solution to Hume’s skepticism, which would form the basis of the critical philosophy, was twofold. The first part of Kant’s solution was to agree with Hume that metaphysical knowledge (such as knowledge of causation) is neither given through the senses nor known a priori through conceptual analysis. Kant argued, however, that there is a third kind of knowledge which is a priori, yet which is not known simply by analyzing concepts. He referred to this as “synthetic a priori knowledge.” Where analytic judgments are justified by the semantic relations between the concepts they mention (for example, “all bachelors are unmarried”), synthetic judgments are justified by their conformity to the given object that they describe (for example, “this ball right here is red”). The puzzle posed by the notion of synthetic a priori knowledge is that it would require that an object be presented to the mind, but not be given in sensory experience.

The second part of Kant’s solution is to explain how synthetic a priori knowledge could be possible. He describes his key insight on this matter as a “Copernican” shift in his thinking about the epistemic relation between the mind and the world. Copernicus had realized that it only appeared as though the sun and stars revolved around us, and that we could have knowledge of the way the solar system really was if we took into account the fact that the sky looks the way it does because we perceivers are moving. Analogously, Kant realized that we must reject the belief that the way things appear corresponds to the way things are in themselves. Furthermore, he argued that the objects of knowledge can only ever be things as they appear, not as they are in themselves. Appealing to this new approach to metaphysics and epistemology, Kant argued that we must investigate the most basic structures of experience (that is, the structures of the way things appear to us), because the basic structures of experience will coincide with the basic structures of any objects that could possibly be experienced. In other words, if it is only possible to have experience of an object if the object conforms to the conditions of experience, then knowing the conditions of experience will give us knowledge – synthetic a priori knowledge in fact – of every possible object of experience. Kant overcomes Hume’s skepticism by showing that we can have synthetic a priori knowledge of objects in general when we take as the object of our investigation the very form of a possible object of experience. Critique of Pure Reason is an attempt to work through all of the important details of this basic philosophical strategy.

c. The Cognitive Faculties and Their Representations

Kant’s theory of the mind is organized around an account of the mind’s powers, its “cognitive faculties.” One of Kant’s central claims is that the cognitive capacities of the mind depend on two basic and fundamentally distinct faculties. First, there is “sensibility.” Sensibility is a passive faculty because its job is to receive representations through the affection of objects on the senses. Through sensibility, objects are “given” to the mind. Second, there is “understanding,” which is an active faculty whose job is to “think” (that is, apply concepts to) the objects given through sensibility.

The most basic type of representation of sensibility is what Kant calls an “intuition.” An intuition is a representation that refers directly to a singular individual object. There are two types of intuitions. Pure intuitions are a priori representations of space and time themselves (see 2di below). Empirical intuitions are a posteriori representations that refer to specific empirical objects in the world. In addition to possessing a spatiotemporal “form,” empirical intuitions also involve sensation, which Kant calls the “matter” of intuition (and of experience generally). (Without sensations, the mind could never have thoughts about real things, only possible ones.) We have empirical intuitions both of objects in the physical world (“outer intuitions”) and objects in our own minds (“inner intuitions”).

The most basic type of representation of understanding is the “concept.” Unlike an intuition, a concept is a representation that refers generally to indefinitely many objects. (For instance, the concept ‘cat’ on its own could refer to any and all cats, but not to any one in particular.) Concepts refer to their objects only indirectly because they depend on intuitions for reference to particular objects. As with intuitions, there are two basic types of concepts. Pure concepts are a priori representations and they characterize the most basic logical structure of the mind. Kant calls these concepts “categories.” Empirical concepts are a posteriori representations, and they are formed on the basis of sensory experience with the world. Concepts are combined by the understanding into “judgments,” which are the smallest units of knowledge. I can only have full cognition of an object in the world once I have, first, had an empirical intuition of the object, second, conceptualized this object in some way, and third, formed my conceptualization of the intuited object into a judgment. This means that both sensibility and understanding must work in cooperation for knowledge to be possible. As Kant expresses it, “Thoughts without content are empty, intuitions without concepts are blind” (A51/B75).

There are two other important cognitive faculties that must be mentioned. The first is transcendental “imagination,” which mediates between sensibility and understanding. Kant calls this faculty “blind” because we do not have introspective access to its operations. Kant says that we can at least know that it is responsible for forming intuitions in such a way that it is possible for the understanding to apply concepts to them. The other is “reason,” which operates in a way similar to the understanding, but which operates independently of the senses. While understanding combines the data of the senses into judgments, reason combines understanding’s judgments together into one coherent, unified, systematic whole. Reason is not satisfied with mere disconnected bits of knowledge. Reason wants all knowledge to form a system of knowledge. Reason is also the faculty responsible for the “illusions” of transcendent metaphysics (see 2g below).

d. Transcendental Idealism

Transcendental idealism is a theory about the relation between the mind and its objects. Three fundamental theses make up this theory: first, there is a distinction between appearances (things as they appear) and things as they are in themselves. Second, space and time are a priori, subjective conditions on the possibility of experience, and hence they pertain only to appearances, not to things in themselves. Third, we can have determinate cognition only of things that can be experienced, hence only of appearances, not things in themselves.

A quick remark on the term “transcendental idealism” is in order. Kant typically uses the term “transcendental” when he wants to emphasize that something is a condition on the possibility of experience. So for instance, the chapter titled “Transcendental Analytic of Concepts” deals with the concepts without which cognition of an object would be impossible. Kant uses the term “idealism” to indicate that the objects of experience are mind-dependent (although the precise sense of this mind-dependence is controversial; see 2dii below). Hence, transcendental idealism is the theory that it is a condition on the possibility of experience that the objects of experience be in some sense mind-dependent.

i. The Ideality of Space and Time

Kant argues that space and time are a priori, subjective conditions on the possibility of experience, that is, that they are transcendentally ideal. Kant grounds the distinction between appearances and things in themselves on the realization that, as subjective conditions on experience, space and time could only characterize things as they appear, not as they are in themselves. Further, the claim that we can only know appearances (not things in themselves) is a consequence of the claims that we can only know objects that conform to the conditions of experience, and that only spatiotemporal appearances conform to these conditions. Given the systematic importance of this radical claim, what were Kant’s arguments for it? What follows are some of Kant’s most important arguments for the thesis.

One argument has to do with the relation between sensations and space. Kant argues that sensations on their own are not spatial, but that they (or arguably the objects they correspond to) are represented in space, “outside and next to one another” (A23/B38). Hence, the ability to sense objects in space presupposes the a priori representation of space, which entails that space is merely ideal, hence not a property of things in themselves.

Another argument that Kant makes repeatedly during the critical period can be called the “argument from geometry.” Its two premises are, first, that the truths of geometry are necessary truths, and thus a priori truths, and second, that the truths of geometry are synthetic (because these truths cannot be derived from an analysis of the meanings of geometrical concepts). If geometry, which is the study of the structure of space, is synthetic a priori, then its object – space – must be a mere a priori representation and not something that pertains to things in themselves. (Kant’s theory of mathematical cognition is discussed further in 3b below.)

Many commentators have found these arguments less than satisfying because they depend on the questionable assumption that if the representations of space and time are a priori they thereby cannot be properties of things in themselves. “Why can’t it be both?” many want to ask. A stronger argument appears in Kant’s discussion of the First and Second Antinomies of Pure Reason (discussed below, 2gii). There Kant argues that if space and time were things in themselves or even properties of things in themselves, then one could prove that space and time both are and are not infinitely large, and that matter in space both is and is not infinitely divisible. In other words, the assumption that space and time are transcendentally real instead of transcendentally ideal leads to a contradiction, and thus space and time must be transcendentally ideal.

ii. Appearances and Things in Themselves

How Kant’s distinction between appearances and things in themselves should be understood is one of the most controversial topics in the literature. It is a question of central importance because how one understands this distinction determines how one will understand the entire nature of Kantian idealism. The following briefly summarizes the main interpretive options, but it does not take a stand on which is correct.

According to “two-world” interpretations, the distinction between appearances and things in themselves is to be understood in metaphysical and ontological terms. Appearances (and hence the entire physical world that we experience) comprise one set of entities, and things in themselves are an ontologically distinct set of entities. Although things in themselves may somehow cause us to have experience of appearances, the appearances we experience are not things in themselves.

According to “one-world” or “two-aspect” interpretations, the distinction between appearances and things in themselves is to be understood in epistemological terms. Appearances are ontologically the very same things as things in themselves, and the phrase “in themselves” simply means “not considered in terms of their epistemic relation to human perceivers.”

A common objection against two-world interpretations is that they may make Kant’s theory too similar to Berkeley’s immaterialist idealism (an association from which Kant vehemently tried to distance himself), and they seem to ignore Kant’s frequent characterization of the appearance/thing in itself distinction in terms of different epistemic standpoints. And a common objection against one-world interpretations is that they may trivialize some of the otherwise revolutionary aspects of Kant’s theory, and they seem to ignore Kant’s frequent characterization of the appearance/thing in itself distinction in seemingly metaphysical terms. There have been attempts at interpretations that are intermediate between these two options. For instance, some have argued that Kant only acknowledges one world, but that the appearance/thing in itself distinction is nevertheless metaphysical, not merely epistemological.

e. The Deduction of the Categories

After establishing the ideality of space and time and the distinction between appearances and things in themselves, Kant goes on to show how it is possible to have a priori cognition of the necessary features of appearances. Cognizing appearances requires more than mere knowledge of their sensible form (space and time); it also requires that we be able to apply certain concepts (for example, the concept of causation) to appearances. Kant identifies the most basic concepts that we can use to think about objects as the “pure concepts of understanding,” or the “categories.”

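There are twelve categories in total, and they fall into four groups of three:

Quantity: Unity, Plurality, Totality
Quality: Reality, Negation, Limitation
Relation: Inherence and Subsistence (substance and accident), Causality and Dependence (cause and effect), Community (reciprocity)
Modality: Possibility, Existence, Necessity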

The task of the chapter titled “Transcendental Deduction of the Categories” is to show that these categories can and must be applied in some way to any object that could possibly be an object of experience. The argument of the Transcendental Deduction is one of the most important moments in the Critique, but it is also one of the most difficult, complex, and controversial arguments in the book. Hence, it will not be possible to reconstruct the argument in any detail here. Instead, Kant’s most important claims and moves in the Deduction are described.

Kant’s argument turns on conceptions of self-consciousness (or what he calls “apperception”) as a condition on the possibility of experiencing the world as a unified whole. Kant takes it to be uncontroversial that we can be aware of our representations as our representations. It is not just that I can have the thoughts ‘P’ or ‘Q’; I am also always able to ascribe these thoughts to myself: ‘I think P’ and ‘I think Q’. Further, we are also able to recognize that it is the same I that does the thinking in both cases. Thus, we can recognize that ‘I think both P and Q’. In general, all of our experience is unified because it can be ascribed to the one and same I, and so this unity of experience depends on the unity of the self-conscious I. Kant next asks what conditions must obtain in order for this unity of self-consciousness to be possible. His answer is that we must be able to differentiate between the I that does the thinking and the object that we think about. That is, we must be able to distinguish between subjective and objective elements in our experience. If we could not make such a distinction, then all experience would just be so many disconnected mental happenings: everything would be subjective and there would be no “unity of apperception” that stands over and against the various objects represented by the I. So next Kant needs to explain how we are able to differentiate between the subjective and objective elements of experience. His answer is that a representation is objective when the subject is necessitated in representing the object in a certain way, that is, when it is not up to the free associative powers of my imagination to determine how I represent it. 
For instance, whether I think a painting is attractive or whether it calls to mind an instance from childhood depends on the associative activity of my own imagination; but the size of the canvas and the chemical composition of the pigments are not up to me: insofar as I represent these as objective features of the painting, I am necessitated in representing them in a certain way. For a representational content to be necessitated in this way, according to Kant, is for it to be subject to a “rule.” The relevant rules that Kant has in mind are the conditions something must satisfy in order for it to be represented as an object at all. And these conditions are precisely the concepts laid down in the schema of the categories, which are the concepts of an “object in general.” Hence, if I am to have experience at all, I must conceptualize objects in terms of the a priori categories.

Kant’s argument in the Deduction is a “transcendental argument”: Kant begins with a premise accepted by everyone, but then asks what conditions must have been met in order for this premise to be true. Kant assumes that we have a unified experience of the many objects populating the world. This unified experience depends on the unity of apperception. The unity of apperception enables the subject to distinguish between subjective and objective elements in experience. This ability, in turn, depends on representing objects in accordance with rules, and the rules in question are the categories. Hence, the only way we can explain the fact that we have experience at all is by appeal to the fact that the categories apply to the objects of experience.

It is worth emphasizing how truly radical the conclusion of the Transcendental Deduction is. Kant takes himself to have shown that all of nature is subject to the rules laid down by the categories. But these categories are a priori: they originate in the mind. This means that the order and regularity we encounter in the natural world is made possible by the mind’s own construction of nature and its order. Thus the conclusion of the Transcendental Deduction parallels the conclusion of the Transcendental Aesthetic: where the latter had shown that the forms of sensibility (space and time) originate in the mind and are imposed on the world, the former shows that the forms of understanding (the categories) also originate in the mind and are imposed on the world.

f. Theory of Experience

The Transcendental Deduction showed not only that it is necessary for us to make use of the categories in experience, but also that we are justified in making use of them. In the following series of chapters (together labeled the Analytic of Principles) Kant attempts to leverage the results of the Deduction and prove that there are transcendentally necessary laws that every possible object of experience must obey. He refers to these as “principles of pure understanding.” These principles are synthetic a priori in the sense defined above (see 2b), and they are transcendental conditions on the possibility of experience.

The first two principles correspond to the categories of quantity and quality. First, Kant argues that every object of experience must have a determinate spatial shape and size and a determinate temporal duration (except mental objects, which have no spatial determinations). Second, Kant argues that every object of experience must contain a “matter” that fills out the object’s extensive magnitude. This matter must be describable as an “intensive magnitude.” Extensive magnitudes are represented through the intuition of the object (the form of the representation) and intensive magnitudes are represented by the sensations that fill out the intuition (the matter of the representation).

The next three principles are discussed in an important, lengthy chapter called the Analogies of Experience. They derive from the relational categories: substance, causality, and community. According to the First Analogy, experience will always involve objects that must be represented as substances. “Substance” here is to be understood in terms of an object that persists permanently as a “substratum” and which is the bearer of impermanent “accidents.” According to the Second Analogy, every event must have a cause. One event is said to be the cause of another when the second event follows the first in accordance with a rule. And according to the Third Analogy (which presupposes the first two), all substances stand in relations of reciprocal interaction with each other. That is, any two pieces of material substance will exert some degree of causal influence on each other, even if they are far apart.

The principles of the Analogies of Experience are important metaphysical principles, and if Kant’s arguments for them are successful, they mark significant advances in the metaphysical investigation of nature. The First Analogy is a form of the principle of the conservation of matter: it shows that matter can never be created or annihilated by natural means; it can only be altered. The Second Analogy is a version of the principle of sufficient reason applied to experience (causes being sufficient reasons for their effects), and it represents Kant’s refutation of Hume’s skepticism regarding causation. Hume had argued that we can never have knowledge of necessary connections between events; rather, we can only perceive certain types of events to be constantly conjoined with other types of events. In arguing that events follow each other in accordance with rules, Kant has shown how we can have knowledge of necessary connections between events above and beyond their mere constant conjunction. Lastly, Kant probably intended the Third Analogy to establish a transcendental, a priori basis for something like Newton’s law of universal gravitation, which says that no matter how far apart two objects are they will exert some degree of gravitational influence on each other.
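
For reference, the Newtonian law mentioned here can be written in modern notation as follows. The symbols F, G, m_1, m_2 and r are standard modern conventions, not Kant’s own notation, and the display is offered only as an illustration of the point about distance.

```latex
% Newton's law of universal gravitation (modern notation; an illustration, not Kant's formulation)
% F: magnitude of the attractive force, G: gravitational constant,
% m_1, m_2: the two masses, r: the distance between them
\[
  F = G\,\frac{m_1 m_2}{r^2},
  \qquad G > 0,\; m_1, m_2 > 0
  \;\Longrightarrow\;
  F > 0 \ \text{for every finite } r > 0 .
\]
```

Because the right-hand side is positive for any finite separation, the force weakens with distance but never vanishes, which is the feature the Third Analogy is supposed to ground a priori.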

The Postulates of Empirical Thinking in General contains the final set of principles of pure understanding and they derive from the modal categories (possibility, actuality, necessity). The Postulates define the different ways to represent the modal status of objects, that is, what it is for an object of experience to be possible, actual, or necessary.

The most important passage from the Postulates chapter is the Refutation of Idealism, which is a refutation of external world skepticism that Kant added to the 1787 edition of the Critique. Kant had been annoyed by reviews of the first edition that unfavorably compared his transcendental idealism with Berkeley’s immaterialist idealism. In the Refutation, Kant argues that his system entails not just that an external (that is, spatial) world is possible (which Berkeley denied), but that we can know it is real (which Descartes and others questioned). Kant’s argumentative strategy in the Refutation is ingenious but controversial. Where the skeptics assume that we have knowledge of the states of our own minds, but say that we cannot be certain that an external world corresponds to these states, Kant turns the tables and argues that we would not have knowledge of the states of our own minds (specifically, the temporal order in which our ideas occur) if we were not simultaneously aware of permanent substances in space, outside of the mind. The precise structure of Kant’s argument, as well as the question of how successful it is, continues to be a matter of heated debate in the literature.

g. Critique of Transcendent Metaphysics

One of the most important upshots of Kant’s theory of experience is that it is possible to have knowledge of the world because the world as we experience it conforms to the conditions on the possibility of experience. Accordingly, Kant holds that there can be knowledge of an object only if it is possible for that object to be given in an experience. This aspect of the epistemological condition of the human subject entails that there are important areas of inquiry about which we would like to have knowledge, but cannot. Most importantly, Kant argued that transcendent metaphysics, that is, philosophical inquiry into “supersensible” objects that are not a part of the empirical world, marks a philosophical dead end. (Note: There is a subtle but important difference between the terms “transcendental” and “transcendent” for Kant. “Transcendental” describes conditions on the possibility of experience. “Transcendent” describes unknowable objects in the “noumenal” realm of things in themselves.)

Kant calls the basic concepts of metaphysical inquiry “ideas.” Unlike concepts of the understanding, which correspond to possible objects that can be given in experience, ideas are concepts of reason, and they do not correspond to possible objects of experience. The three most important ideas with which Kant is concerned in the Transcendental Dialectic are the soul, the world (considered as a totality), and God. The peculiar thing about these ideas of reason is that reason is led by its very structure to posit objects corresponding to these ideas. It cannot help but do this because reason’s job is to unify cognitions into a systematic whole, and it finds that it needs these ideas of the soul, the world, and God, in order to complete this systematic unification. Kant refers to reason’s inescapable tendency to posit unexperienceable and hence unknowable objects corresponding to these ideas as “transcendental illusion.”

Kant presents his analysis of transcendental illusion and his critique of transcendent metaphysics in the series of chapters titled “Transcendental Dialectic,” which takes up the majority of the second half of Critique of Pure Reason. This section summarizes Kant’s most important arguments from the Dialectic.

i. The Soul (Paralogisms of Pure Reason)

Kant addresses the metaphysics of the soul – an inquiry he refers to as “rational psychology” – in the Paralogisms of Pure Reason. Rational psychology, as Kant describes it, is the attempt to prove metaphysical theses about the nature of the soul through an analysis of the simple proposition, “I think.” Many of Kant’s rationalist predecessors and contemporaries had thought that reflection on the notion of the “I” in the proposition “I think” would reveal that the I is necessarily a substance (which would mean that the I is a soul), an indivisible unity (which some would use to prove the immortality of the soul), self-identical (which is relevant to questions regarding personal identity), and distinct from the external world (which can lead to external-world skepticism). Kant argues that such reasoning is the result of transcendental illusion.

Transcendental illusion in rational psychology arises when the mere thought of the I in the proposition “I think” is mistaken for a cognition of the I as an object. (A cognition involves both intuition and concept, while a mere thought involves only concept.) For instance, consider the question whether we can cognize the I as a substance (that is, as a soul). On the one hand, something is cognized as a substance when it is represented only as the subject of predication and is never itself the predicate of some other subject. The I of “I think” is always represented as subject (the I’s various thoughts are its predicates). On the other hand, something can only be cognized as a substance when it is given as a persistent object in an intuition (see 2f above), and there can be no intuition of the I itself. Hence although we cannot help but think of the I as a substantial soul, we can never have cognition of the I as a substance, and hence knowledge of the existence and nature of the soul is impossible.

ii. The World (Antinomies of Pure Reason)

The Antinomies of Pure Reason deal with “rational cosmology,” that is, with metaphysical inquiry into the nature of the cosmos considered as a totality. An “antinomy” is a conflict of reason with itself. Antinomies arise when reason seems to be able to prove two opposed and mutually contradictory propositions with apparent certainty. Kant discusses four antinomies in the first Critique (he uncovers other antinomies in later writings as well). The First Antinomy shows that reason seems to be able to prove that the universe is both finite and infinite in space and time. The Second Antinomy shows that reason seems to be able to prove that matter both is and is not infinitely divisible into ever smaller parts. The Third Antinomy shows that reason seems to be able to prove that free will cannot be a causally efficacious part of the world (because all of nature is deterministic) and yet that it must be such a cause. And the Fourth Antinomy shows that reason seems to be able to prove that there is and there is not a necessary being (which some would identify with God).

In all four cases, Kant attempts to resolve these conflicts of reason with itself by appeal to transcendental idealism. The claim that space and time are not features of things in themselves is used to resolve the First and Second Antinomies. Since the empirical world in space and time is identified with appearances, and since the world as a totality can never itself be given as a single appearance, there is no determinate fact of the matter regarding the size of the universe: It is neither determinately finite nor determinately infinite; rather, it is indefinitely large. Similarly, matter has neither simplest atoms (or “monads”) nor is it infinitely divided; rather, it is indefinitely divisible.

The distinction between appearances and things in themselves is used to resolve the Third and Fourth Antinomies. Although every empirical event experienced within the realm of appearance has a deterministic natural cause, it is at least logically possible that freedom can be a causally efficacious power at the level of things in themselves. And although every empirical object experienced within the realm of appearance is a contingently existing entity, it is logically possible that there is a necessary being outside the realm of appearance which grounds the existence of the contingent beings within the realm of appearance. It must be kept in mind that Kant has not claimed to demonstrate the existence of a transcendent free will or a transcendent necessary being: Kant denies the possibility of knowledge of things in themselves. Instead, Kant only takes himself to have shown that the existence of such entities is logically possible. In his moral theory, however, Kant will offer an argument for the actuality of freedom (see 5c below).

iii. God (Ideal of Pure Reason)

The Ideal of Pure Reason addresses the idea of God and argues that it is impossible to prove the existence of God. The argumentation in the Ideal of Pure Reason was anticipated in Kant’s The Only Possible Argument in Support of a Demonstration of the Existence of God (1763), making this aspect of Kant’s mature thought one of the most significant remnants of the pre-critical period.

Kant identifies the idea of God with the idea of an ens realissimum, or “most real being.” This most real being is also considered by reason to be a necessary being, that is, something which exists necessarily instead of merely contingently. Reason is led to posit the idea of such a being when it reflects on its conceptions of finite beings with limited reality and infers that the reality of finite beings must derive from and depend on the reality of the most infinitely perfect being. Of course, the fact that reason necessarily thinks of a most real, necessary being does not entail that such a being exists. Kant argues that there are only three possible arguments for the existence of such a being, and that none is successful.

According to the ontological argument for the existence of God (versions of which were proposed by St. Anselm (1033-1109) and Descartes (1596-1650), among others), God is the only being whose essence entails its existence. Kant famously objects that this argument mistakenly treats existence as a “real predicate.” According to Kant, when I make an assertion of the form “x is necessarily F,” all I can mean is that “if x exists, then x must be F.” Thus when proponents of the ontological argument claim that the idea of God entails that “God necessarily exists,” all they can mean is that “if God exists, then God exists,” which is an empty tautology.

Kant also offers lengthy criticisms of the cosmological argument (the existence of contingent beings entails the existence of a necessary being) and the physico-theological argument, which is also referred to as the “argument from design” (the order and purposiveness in the empirical world can only be explained by a divine creator). Kant argues that both of these implicitly depend on the argumentation of the ontological argument pertaining to necessary existence, and since it fails, they fail as well.

Although Kant argues in the Transcendental Dialectic that we cannot have cognition of the soul, of freedom of the will, or of God, in his ethical writings he will complicate this story and argue that we are justified in believing in these things (see 5c below).

3. Philosophy of Mathematics

The distinction between analytic and synthetic judgments (see 2b above) is necessary for understanding Kant’s theory of mathematics. Recall that an analytic judgment is one where the truth of the judgment depends only on the relation between the concepts used in the judgment. The truth of a synthetic judgment, by contrast, requires that an object be “given” in sensibility and that the concepts used in the judgment be combined in the object. In these terms, most of Kant’s predecessors took mathematical truths to be analytic truths. Kant, by contrast, argued that mathematical knowledge is synthetic. It may seem surprising that one’s knowledge of mathematical truths depends on an object being given in sensibility, for we surely don’t arrive at mathematical knowledge by empirical means. Recall, however, that a judgment can be both synthetic and a priori. Like the judgments of the necessary structures of experience, mathematics is also synthetic a priori according to Kant.

To make this point, Kant considers the proposition ‘7+5=12’. Surely, this proposition is a priori: I can know its truth without doing empirical experiments to see what happens when I put seven things next to five other things. More to the point, ‘7+5=12’ must be a priori because it is a necessary truth, and empirical judgments are always merely contingent according to Kant. Yet at the same time, the judgment is not analytic because, “The concept of twelve is by no means already thought merely by my thinking of that unification of seven and five, and no matter how long I analyze my concept of such a possible sum I will still not find twelve in it” (B15).

If mathematical knowledge is synthetic, then it depends on objects being given in sensibility. And if it is a priori, then these objects must be non-empirical objects. What sort of objects does Kant have in mind here? The answer lies in Kant’s theory of the pure forms of intuition (space and time). Recall that an intuition is a singular, immediate representation of an individual object (see 2c above). Empirical intuitions represent sensible objects through sensation, but pure intuitions are a priori representations of space and time as such. These pure intuitions of space and time provide the objects of mathematics through what Kant calls a “construction” of concepts in pure intuition. As he puts it, “to construct a concept means to exhibit a priori the intuition corresponding to it” (A713/B741). A mathematical concept (for example, ‘triangle’) can be thought of as a rule for how to make an object that corresponds to that concept. Thus if ‘triangle’ is defined as ‘three-sided, two-dimensional shape’, then I construct a triangle in pure intuition when I imagine three lines coming together to form a two-dimensional figure. These pure constructions in intuition can be used to arrive at (synthetic, a priori) mathematical knowledge. Consider the proposition, ‘The angles of a triangle sum to 180 degrees’. When I construct a triangle in intuition in accordance with the rule ‘three-sided, two-dimensional shape’, then the constructed triangle will in fact have angles that sum to 180 degrees. And this will be true irrespective of what particular triangle I constructed (isosceles, scalene, and so forth). Kant holds that all mathematical knowledge is derived in this fashion: I take a concept, construct it in pure intuition, and then determine what features of the constructed intuition are necessarily true of it.

4. Natural Science

In addition to his work in pure theoretical philosophy, Kant displayed an active interest in the natural sciences throughout his career. Most of his important scientific contributions were in the physical sciences (including not just physics proper, but also earth sciences and cosmology). In Critique of the Power of Judgment (1790) he also presented a lengthy discussion of the philosophical basis of the study of biological entities.

In general, Kant thought that a body of knowledge could only count as a science in the true sense if it could admit of mathematical description and an a priori principle that could be “presented a priori in intuition” (4:471). Hence, Kant was pessimistic about the possibility of empirical psychology ever amounting to a true science. Kant even thought it might be the case that “chemistry can be nothing more than a systematic art or experimental doctrine, but never a proper science” (4:471).

This section focuses primarily on Kant’s physics (4a), but it also lists several of Kant’s other scientific contributions (4b).

a. Physics

Kant’s interest in physical theory began early. His first published work, Thoughts on the True Estimation of Living Forces (1749), was an inquiry into some foundational problems in physics, and it entered into the “vis viva” (“living forces”) debate between Leibniz and the Cartesians regarding how to quantify force in moving objects (for the most part, Kant sided with the Leibnizians). A few years later, Kant wrote the Physical Monadology (1756), which dealt with other foundational questions in physics (see 2a above).

Kant’s mature physical theory is presented in its fullest form in Metaphysical Foundations of Natural Science (1786). This theory can be understood as an outgrowth and consequence of the transcendental theory of experience articulated in Critique of Pure Reason (see 2f above). Where the Critique had shown the necessary conceptual forms to which all possible objects of experience must conform, the Metaphysical Foundations specifies in greater detail what exactly the physical constitution of these objects must be like. The continuity with the theory of experience from the Critique is implicit in the very structure of the Metaphysical Foundations. Just as Kant’s theory of experience was divided into four sections corresponding to the four groups of categories (quantity, quality, relation, modality), the body of the Metaphysical Foundations is also divided along the same lines.

Like the theory of the Physical Monadology, the Metaphysical Foundations presents a “dynamical” theory of matter according to which material substance is constituted by an interaction between attractive and repulsive forces. The basic idea is that each volume of material substance possesses a brute tendency to expand and push away other volumes of substance (this is repulsive force) and each volume of substance possesses a brute tendency to contract and to attract other volumes of substance (this is attractive force). The repulsive force explains the solidity and impenetrability of bodies while the attractive force explains gravitation (and presumably also phenomena such as magnetic attraction). Further, any given volume of substance will possess these forces to a determinate degree: the matter in a volume can be more or less repulsive and more or less attractive. The ratio of attractive and repulsive force in a substance will determine how dense the body is. In this respect, Kant’s theory marks a sharp break from those of his mechanist predecessors. (Mechanists believed that all physical phenomena could be explained by appeal to the sizes, shapes, and velocities of material bodies.) The Cartesians thought that there is no true difference in density and that the appearance of differences in density could be explained by appeal to porosity in the body. Similarly, the atomists thought that density could be explained by differences in the ratio of atoms to void in any given volume. Thus for both of these theories, any time there was a volume completely filled in with material substance (no pores, no void), there could only be one possible value for mass divided by volume. According to Kant’s theory, by contrast, two volumes of equal size could be completely filled in with matter and yet differ in their quantity of matter (their mass), and hence differ in their density (mass divided by volume). 
Another consequence of Kant’s theory that puts him at odds with the Cartesians and atomists is his claim that matter is elastic, hence compressible: a completely filled volume of matter could be reduced in volume while the quantity of matter remained unchanged (hence it would become denser). The Cartesians and atomists took this to be impossible.
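
The contrast between the two views of density can be put in a short worked form. The symbols \rho, m, V and \rho_0 are modern shorthand introduced here purely for illustration; they are not Kant’s notation, and the display is a sketch of the reasoning in the text rather than a reconstruction of his own derivation.

```latex
% Density as mass divided by volume (illustrative modern notation)
\[
  \rho = \frac{m}{V}
\]
% Mechanist view (Cartesians, atomists): a completely filled volume
% (no pores, no void) admits only one value, here called \rho_0 (a hypothetical label).
\[
  \text{completely filled} \;\Longrightarrow\; \rho = \rho_0
\]
% Kant's dynamical view: two completely filled volumes of equal size can differ in mass.
\[
  V_1 = V_2,\quad m_1 \neq m_2
  \;\Longrightarrow\;
  \rho_1 = \frac{m_1}{V_1} \neq \frac{m_2}{V_2} = \rho_2
\]
% Compressibility: the quantity of matter m is fixed while the volume shrinks from V to V'.
\[
  0 < V' < V
  \;\Longrightarrow\;
  \rho' = \frac{m}{V'} > \frac{m}{V} = \rho
\]
```

On the mechanist picture the last two cases are ruled out for completely filled volumes, whereas on Kant’s picture they follow once the degree of attractive and repulsive force is allowed to vary.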

At the end of his career, Kant worked on a project that was supposed to complete the connection between the transcendental philosophy and physics. Among other things, Kant attempted to give a transcendental, a priori demonstration of the existence of a ubiquitous “ether” that permeates all of space. Although Kant never completed a manuscript for this project (due primarily to the deterioration of his mental faculties at the end of his life), he did leave behind many notes and partial drafts. Many of these notes and drafts have been edited and published under the title Opus Postumum.

b. Other Scientific Contributions

In addition to his major contributions to physics, Kant published various writings addressing different issues in the natural sciences. Early on he showed a great deal of interest in geology and earth science, as evidenced by the titles of some of his shorter essays: The question, Whether the Earth is Ageing, Considered from a Physical Point of View (1754); On the Causes of Earthquakes on the Occasion of the Calamity that Befell the Western Countries of Europe Towards the End of Last Year (1756); Continued Observations on the Earthquakes that Have been Experienced for Some Time (1756); New Notes to Explain the Theory of the Winds, in which, at the Same Time, He Invites Attendance to his Lectures (1756).

In 1755, he wrote the Succinct Exposition of Some Meditations on Fire (which he submitted to the university as a Master’s Thesis). There he argued, against the Cartesian mechanists, that physical phenomena such as fire can only be explained by appeal to elastic (that is, compressible) matter, which anticipated the mature physics of his Metaphysical Foundations (see 4a above).

One of Kant’s most lasting scientific contributions came from his early work in cosmology. In his Universal Natural History and Theory of the Heavens (1755), Kant gave a mechanical explanation of the formation of the solar system and the galaxies in terms of the principles of Newtonian physics. (A shorter version of the argument also appears in The Only Possible Argument in Support of a Demonstration of the Existence of God from 1763.) Kant’s hypothesis was that a single mechanical process could explain why we observe an orbital motion of smaller bodies around larger ones at many different scales in the cosmos (moons around planets, planets around stars, and stars around the center of the galaxy). He proposed that at the beginning of creation, all matter was spread out more or less evenly and randomly in a kind of nebula. Since the various bits of matter all attracted each other through gravitation, bodies would move towards each other within local regions to form larger bodies. The largest of these became stars, and the smaller ones became moons or planets. Because everything was already in motion (due to the gravitational attraction of everything to everything), and because all objects would be pulled towards the center of mass of their local region (for example, the sun at the center of the solar system, or a planet at the center of its own smaller planetary system), the motion of objects within that region would become orbital motions (as described by Newton’s theory of gravity). Although the Universal Natural History was not widely read for most of Kant’s lifetime (due primarily to Kant’s publisher going bankrupt while the printed books remained in a warehouse), in 1796 Pierre-Simon Laplace (1749-1827) proposed a remarkably similar version of the same theory, and this caused renewed interest in Kant’s book. The theory is now referred to as the “Kant-Laplace Nebular Hypothesis,” and a modified version of it is still held today.

Finally, in the second half of Critique of the Power of Judgment (1790), Kant discusses the philosophical foundations of biology by way of an analysis of teleological judgments. While this is in no way a fully worked-out biological theory per se, Kant connects his account of biological cognition in interesting ways to other important aspects of his philosophical system. First, natural organisms are essentially teleological, or “purposive.” This purposiveness is manifested through the organic structure of the organism: its many parts all work together to constitute the whole, and any one part only makes sense in terms of its relation to the healthy functioning of the whole. For instance, the teeth of an animal are designed to chew the kind of food that the animal is equipped to hunt or forage and that it is suited to digest. In this respect, biological entities bear a strong analogy to great works of art. Great works of art are also organic insofar as the parts only make sense in the context of the whole, and art displays a purposiveness similar to that found in nature (see section 7 below). Second, Kant discusses the importance of biology with respect to theological cognition. While he denies that the apparent design behind the purposiveness of organisms can be used as a proof for God’s existence (see 2giii above), he does think that the purposiveness found in nature provides a sort of hint that there is an intelligible principle behind the observable, natural world, and hence that the ultimate purpose of all of nature is a rational one. In connection with his moral theory and theory of human history (see sections 5 and 6 below), Kant will argue that the teleology of nature can be understood as ultimately directed towards a culmination in a fully rational nature, that is, humanity in its (future) final form.

5. Moral Theory

Kant’s moral theory is organized around the idea that to act morally and to act in accordance with reason are one and the same. In virtue of being a rational agent (that is, in virtue of possessing practical reason, reason which is interested and goal-directed), one is obligated to follow the moral law that practical reason prescribes. To do otherwise is to act irrationally. Because Kant places his emphasis on the duty that comes with being a rational agent who is cognizant of the moral law, Kant’s theory is considered a form of deontology (deon– comes from the Greek for “duty” or “obligation”).

Like his theoretical philosophy, Kant’s practical philosophy is a priori, formal, and universal: the moral law is derived non-empirically from the very structure of practical reason itself (its form), and since all rational agents share the same practical reason, the moral law binds and obligates everyone equally. So what is this moral law that obligates all rational agents universally and a priori? The moral law is determined by what Kant refers to as the Categorical Imperative, which is the general principle that demands that one respect the humanity in oneself and in others, that one not make an exception for oneself when deliberating about how to act, and in general that one only act in accordance with rules that everyone could and should obey.

Although Kant insists that the moral law is equally binding for all rational agents, he also insists that the bindingness of the moral law is self-imposed: we autonomously prescribe the moral law to ourselves. Because Kant thinks that the kind of autonomy in question here is only possible under the presupposition of a transcendentally free basis of moral choice, the constraint that the moral law places on an agent is not only consistent with freedom of the will, it requires it. Hence, one of the most important aspects of Kant’s project is to show that we are justified in presupposing that our morally significant choices are grounded in a transcendental freedom (the very sort of freedom that Kant argued we could not prove through mere “theoretical” or “speculative” reason; see 2gii above).

This section aims to explain the structure and content of Kant’s moral theory (5a-b), and also Kant’s claims that belief in freedom, God, and the immortality of the soul are necessary “postulates” of practical reason (5c). (On the relation between Kant’s moral theory and his aesthetic theory, see 7c below.)

a. The Good Will and Duty

Kant lays out the case for his moral theory in Groundwork for the Metaphysics of Morals (1785), Critique of Practical Reason (also known as the “Second Critique”; 1788), and the Metaphysics of Morals (1797). His arguments from the Groundwork are his most well-known and influential, so the following focuses primarily on them.

Kant begins his argument from the premise that a moral theory must be grounded in an account of what is unconditionally good. If something is merely conditionally good, that is, if its goodness depends on something else, then that other thing will either be merely conditionally good as well, in which case its goodness depends on yet another thing, or it will be unconditionally good. All goodness, then, must ultimately be traceable to something that is unconditionally good. There are many things that we typically think of as good but that are not truly unconditionally good. Beneficial resources such as money or power are often good, but since these things can be used for evil purposes, their goodness is conditional on the use to which they are put. Strength of character is generally a good thing, but again, if someone uses a strong character to successfully carry out evil plans, then the strong character is not good. Even happiness, according to Kant, is not unconditionally good. Although all humans universally desire to be happy, if someone is happy but does not deserve their happiness (because, for instance, their happiness results from stealing from the elderly), then it is not good for the person to be happy. Happiness is only good on the condition that the happiness is deserved.

Kant argues that there is only one thing that can be considered unconditionally good: a good will. A person has a good will insofar as they form their intentions on the basis of a self-conscious respect for the moral law, that is, for the rules regarding what a rational agent ought to do, one’s duty. The value of a good will lies in the principles on the basis of which it forms its intentions; it does not lie in the consequences of the actions that the intentions lead to. This is true even if a good will never leads to any desirable consequences at all: “Even if… this will should wholly lack the capacity to carry out its purpose… then, like a jewel, it would still shine by itself, as something that has its full worth in itself” (4:393). This is in line with Kant’s emphasis on the unconditional goodness of a good will: if a will were evaluated in terms of its consequences, then the goodness of the will would depend on (that is, would be conditioned on) those consequences. (In this respect, Kant’s deontology is in stark opposition to consequentialist moral theories, which base their moral evaluations on the consequences of actions rather than the intentions behind them.)

b. The Categorical Imperative

If a good will is one that forms its intentions on the basis of correct principles of action, then we want to know what sort of principles these are. A principle that commands an action is called an “imperative.” Most imperatives are “hypothetical imperatives,” that is, they are commands that hold only if certain conditions are met. For instance: “if you want to be a successful shopkeeper, then cultivate a reputation for honesty.” Since hypothetical imperatives are conditioned on desires and the intended consequences of actions, they cannot serve as the principles that determine the intentions and volitions of an unconditionally good will. Instead, we require what Kant calls a “categorical imperative.” Where hypothetical imperatives take the form, “if y is desired/intended/sought, do x,” categorical imperatives simply take the form, “do x.” Since a categorical imperative is stripped of all reference to the consequences of an action, it is thereby stripped of all determinate content, and hence it is purely formal. And since it is unconditional, it holds universally. Hence a categorical imperative expresses only the very form of a universally binding law: “nothing is left but the conformity of actions as such with universal law” (4:402). To act morally, then, is to form one’s intentions on the basis of the very idea of a universal principle of action.

This conception of a categorical imperative leads Kant to his first official formulation of the categorical imperative itself: “act only in accordance with that maxim through which you can at the same time will that it become a universal law” (4:421). A maxim is a general rule that can be used to determine particular courses of action in particular circumstances. For instance, the maxim “I shall lie when it will get me out of trouble” can be used to determine the decision to lie about an adulterous liaison. The categorical imperative offers a decision procedure for determining whether a given course of action is in accordance with the moral law. After determining what maxim one would be basing the action in question on, one then asks whether it would be possible, given the power (in an imagined, hypothetical scenario), to choose that everyone act in accordance with that same maxim. If it is possible to will that everyone act according to that maxim, then the action under consideration is morally permissible. If it is not possible to will that everyone act according to that maxim, the action is morally impermissible. Lying to cover up adultery is thus immoral because one cannot will that everyone act according to the maxim, “I shall lie when it will get me out of trouble.” Note that it is not simply that it would be undesirable for everyone to act according to that maxim. Rather, it would be impossible. Since everyone would know that everyone else was acting according to that maxim, there would never be the presupposition that anyone was telling the truth; the very act of lying, of course, requires such a presupposition on the part of the one being lied to. Hence, the state of affairs where everyone lies to get out of trouble can never arise, so it cannot be willed to be a universal law. It fails the test of the categorical imperative.

The point of Kant’s appeal to the universal law formulation of the categorical imperative is to show that an action is morally permissible only if the maxim on which the action is based could be affirmed as a universal law that everyone obeys without exception. The mark of immorality, then, is that one makes an exception for oneself. That is, one acts in a way that one would not want everyone else to act. Someone who chooses to lie about an adulterous liaison is implicitly thinking, “in general people should tell the truth, but in this case I will be the exception to the rule.”

Kant’s first formulation of the categorical imperative describes it in terms of the very form of universal law itself. This formal account abstracts from any specific content that the moral law might have for living, breathing human beings. Kant offers a second formulation to address the material side of the moral law. Since the moral law has to do with actions, and all actions are by definition teleological (that is, goal-directed), a material formulation of the categorical imperative will require an appeal to the “ends” of human activity. Some ends are merely instrumental, that is, they are sought only because they serve as “means” towards further ends. Kant argues that the moral law must be aimed at an end that is not merely instrumental, but is rather an end in itself. Only rational agents, according to Kant, are ends in themselves. To act morally is thus to respect rational agents as ends in themselves. Accordingly, the categorical imperative can be reformulated as follows: “So act that you use humanity, whether in your own person or in the person of any other, always at the same time as an end, never merely as a means” (4:429). The basic idea here is that it is immoral to treat someone as a thing of merely instrumental value; persons have an intrinsic (non-instrumental) value, and the moral law demands that we respect this intrinsic value. To return to the example of the previous paragraphs, it would be wrong to lie about an adulterous liaison because by withholding the truth one is manipulating the other person to make things easier for oneself; this sort of manipulation, however, amounts to treating the other as a thing (as a mere means to the comfort of not getting in trouble), and not as a person deserving of respect and entitled to the truth.

The notion of a universal law provides the form of the categorical imperative and rational agents as ends in themselves provide the matter. These two sides of the categorical imperative are combined into yet a third formulation, which appeals to the notion of a “kingdom of ends.” A kingdom of ends can be thought of as a sort of perfectly just utopian ideal in which all citizens of this kingdom freely respect the intrinsic worth of the humanity in all others because of an autonomously self-imposed recognition of the bindingness of the universal moral law for all rational agents. The third formulation of the categorical imperative is simply the idea that one should act in whatever way a member of this perfectly just society would act: “act in accordance with the maxims of a member giving universal laws for a merely possible kingdom of ends” (4:439). The idea of a kingdom of ends is an ideal (hence the “merely possible”). Although humanity may never be able to achieve such a perfect state of utopian coexistence, we can at least strive to approximate this state to an ever greater degree.

c. Postulates of Practical Reason

In Critique of Pure Reason, Kant had argued that although we can acknowledge the bare logical possibility that humans possess free will, that there is an immortal soul, and that there is a God, we can never have positive knowledge of these things (see 2g above). In his ethical writings, however, Kant complicates this story. He argues that despite the theoretical impossibility of knowledge of these objects, belief in them is nevertheless a precondition for moral action (and for practical cognition generally). Accordingly, freedom, immortality, and God are “postulates of practical reason.” (The following discussion draws primarily on Critique of Practical Reason.)

We will start with freedom. Kant argues that morality and the obligation that comes with it are only possible if humans have free will. This is because the universal laws prescribed by the categorical imperative presuppose autonomy (autos = self; nomos = law). To be autonomous is to be the free ground of one’s own principles, or “laws” of action. Kant argues that if we presuppose that humans are rational and have free will, then his entire moral theory follows directly. The problem, however, lies in justifying the belief that we are free. Kant had argued in the Second Analogy of Experience that every event in the natural world has a “determining ground,” that is, a cause, and so all human actions, as natural events, themselves have deterministic causes (see 2f above). The only room for freedom of the will would lie in the realm of things in themselves, which contains the noumenal correlate of my phenomenal self. Since things in themselves are unknowable, I can never look to them to get evidence that I possess transcendental freedom. Kant gives at least two arguments to justify belief in freedom as a precondition of his moral theory. (There is a great deal of controversy among commentators regarding the exact form of his arguments, as well as their success. It will not be possible to adjudicate those disputes in any detail here. See Section 10 (References and Further Readings) for references to some of these commentaries.)

In the Groundwork, Kant suggests that the presupposition that we are free follows as a consequence of the fact that we have practical reason and that we think of ourselves as practical agents. Any time I face a choice that requires deliberation, I must consider the options before me as really open. If I thought of my course of action as already determined ahead of time, then there would not really be any choice to make. Furthermore, in taking my deliberation to be real, I also think of the possible outcomes of my actions as caused by me. The notion of a causality that originates in the self is the notion of a free will. So the very fact that I do deliberate about what actions I will take means that I am presupposing that my choice is real and hence that I am free. As Kant puts it, all practical agents act “under the idea of freedom” (4:448). It is not obvious that this argument is strong enough for Kant’s purposes. The position seems to be that I must act as though I am free, but acting as though I am free in no way entails that I really am free. At best, it seems that since I act as though I am free, I thereby must act as though morality really does obligate me. This does not establish that the moral law really does obligate me.

In the Second Critique, Kant offers a different argument for the reality of freedom. He argues that it is a brute “fact of reason” (5:31) that the categorical imperative (and so morality generally) obligates us as rational agents. In other words, all rational agents are at least implicitly conscious of the bindingness of the moral law on us. Since morality requires freedom, it follows that if morality is real, then freedom must be real too. Thus this “fact of reason” allows for an inference to the reality of freedom. Although the conclusion of this argument is stronger than that of the earlier argument, its premise is more controversial. For instance, it is far from obvious that all rational agents are conscious of the moral law. If they were, why did no one discover this exact moral law before 1785, when Kant wrote the Groundwork? Equally problematic, it is not clear why this “fact of reason” should count as knowledge of the bindingness of the moral law. It may just be that we cannot help but believe that the moral law obligates us, in which case we once again end up merely acting as though we are free and as though the moral law is real.

Again, there is much debate in the literature about the structure and success of Kant’s arguments. It is clear, however, that the success of Kant’s moral project stands or falls with his arguments for freedom of the will, and that the overall strength of this theory is determined to a high degree by the epistemic status of our belief in our own freedom.

Kant’s arguments for immortality and God as postulates of practical reason presuppose that the reality of the moral law and the freedom of the will have been established, and they also depend on the principle that “‘ought’ implies ‘can’”: one cannot be obligated to do something unless the thing in question is doable. For instance, there is no sense in which I am obligated to single-handedly solve global poverty, because it is not within my power to do so. According to Kant, the ultimate aim of a rational moral agent should be to become perfectly moral. We are obligated to strive to become ever more moral. Given the “ought implies can” principle, if we ought to work towards moral perfection, then moral perfection must be possible and we can become perfect. However, Kant holds that moral perfection is something that finite rational agents such as humans can only progress towards, but not actually attain in any finite amount of time, and certainly not within any one human lifetime. Thus the moral law demands an “endless progress” towards “complete conformity of the will with the moral law” (5:122). This endless progress towards perfection can only be demanded of us if our own existence is endless. In short, one’s belief that one should strive towards moral perfection presupposes the belief in the immortality of the soul.

In addition to the “ought implies can” principle, Kant’s argument about belief in God also involves an elaboration of the notion of the “highest good” at which all moral action aims (at least indirectly). According to Kant, the highest good, that is, the most perfect possible state for a community of rational agents, is not only one in which all agents act in complete conformity with the moral law. It is also a state in which these agents are happy. Kant had argued that although everyone naturally desires to be happy, happiness is only good when one deserves to be happy. In the ideal scenario of a morally perfect community of rational agents, everyone deserves to be happy. Since a deserved happiness is a good thing, the highest good will involve a situation in which everyone acts in complete conformity with the moral law and everyone is completely happy because they deserve to be. Now since we are obligated to work towards this highest good, this complete, universal, morally justified happiness must be possible (again, because “ought” implies “can”). This is where a puzzle arises. Although happiness is connected to morality at the conceptual level when one deserves happiness, there is no natural connection between morality and happiness. Our happiness depends on the natural world (for example, whether we are healthy, whether natural disasters affect us), and the natural world operates according to laws that are completely separate from the laws of morality. Accordingly, acting morally is in general no guarantee that nature will make it possible for one to be happy. If anything, behaving morally will often decrease one’s happiness (for doing the right thing often involves doing the uncomfortable, difficult thing). And we all have plenty of empirical evidence from the world we live in that often bad things happen to good people and good things happen to bad people.

Thus if the highest good (in which happiness is proportioned to virtue) is possible, then somehow there must be a way for the laws of nature to eventually lead to a situation in which happiness is proportioned to virtue. (Note that since at this point in the argument, Kant takes himself to have established immortality as a postulate of practical reason, this “eventually” may very well be far in the future). Since the laws of nature and the laws of morality are completely separate on their own, the only way that the two could come together such that happiness ends up proportioned to virtue would be if the ultimate cause and ground of nature set up the world in such a way that the laws of nature would eventually lead to the perfect state in question. Therefore, the possibility of the highest good requires the presupposition that the cause of the world is intelligent and powerful enough to set nature up in the right way, and also that it wills in accordance with justice that eventually the laws of nature will indeed lead to a state in which the happiness of rational agents is proportioned to their virtue. This intelligent, powerful, and just cause of the world is what traditionally goes by the name of “God.” Hence God is a postulate of practical reason.

6. Political Theory and Theory of Human History

Kant’s ethical theory emphasized reason, autonomy, and a respect for the humanity of others. These central aspects of his theory of individual moral choice are carried over to his theories of humanity’s history and of ideal political organization. This section covers Kant’s teleological history of the human race (6a), the basic elements of his political theory (6b), and his theory of the possibility of world peace (6c).

a. Human History and the Age of Enlightenment

Kant’s socio-political philosophy must be understood in terms of his understanding of the history of humanity, of its teleology, and in terms of his particular time and place: Europe during the Enlightenment.

In his short essay “Idea for a Universal History with a Cosmopolitan Purpose” (1784), Kant outlines a speculative sketch of humanity’s history organized around his conception of the teleology intrinsic to the species. The natural purpose of humanity is the development of reason. This development is not something that can take place in one individual lifetime, but is instead the ongoing project of humanity across the generations. Nature fosters this goal through both human physiology and human psychology. Humans have no fur, claws, or sharp teeth, and so if we are to be sheltered and fed, we must use our reason to create the tools necessary to satisfy our needs. More importantly, at the cultural level, Kant argues that human society is characterized by an “unsocial sociability”: on the one hand, humans need to live with other humans and we feel incomplete in isolation; but on the other, we frequently disagree with each other and are frustrated when others don’t agree with us on important matters. The frustration brought on by disagreement serves as an incentive to develop our capacity to reason so that we can argue persuasively and convince others to agree with us.

By means of our physiological deficiencies and our unsocial sociability, nature has nudged us, generation by generation, to develop our capacity for reason and slowly to emerge from the hazy fog of pre-history up to the present. This development is not yet complete. Kant takes stock of where we were in his day (in late 18th-century Prussia) in his short, popular essay “An Answer to the Question: What is Enlightenment?” (1784). To be enlightened, he argues, is to determine one’s beliefs and actions in accordance with the free use of one’s reason. The process of enlightenment is humanity’s “emergence from its self-incurred immaturity” (8:35), that is, the emergence from an uncritical reliance on the authority of others (for example, parents, monarchs, or priests). This is a slow, ongoing process. Kant thought that his own age was an age of enlightenment, but not yet a fully enlightened age.

The goal of humanity is to reach a point where all interpersonal interactions are conducted in accordance with reason, and hence in accordance with the moral law (this is the idea of a kingdom of ends described in 5b above). Kant thinks that there are two significant conditions that must be in place before such an enlightened age can come to be. First, humans must live in a perfectly just society under a perfectly just constitution. Second, the nations of the world must coexist as an international federation in a state of “perpetual peace.” Some aspects of the first condition are discussed in 6b, and of the second in 6c.

b. Political Theory

Kant’s fullest articulation of his political theory appears in the “Doctrine of Right,” which is the first half of Metaphysics of Morals (1797). In line with his belief that a freedom grounded in rationality is what bestows dignity upon human beings, Kant organizes his theory of justice around the notion of freedom: “Any action is right if it can coexist with everyone’s freedom in accordance with a universal law, or if on its maxim the freedom of choice of each can coexist with everyone’s freedom in accordance with a universal law” (6:230). Implicit in this definition is a theory of equality: everyone should be granted the same degree of freedom. Although a state, through the passing and enforcing of laws, necessarily restricts freedom to some degree, Kant argues that this is necessary for the preservation of equality of human freedom. This is because when the freedoms of all are unchecked (for example, in the state of nature, which is also a condition of anarchy), the strong will overpower the weak and infringe on their freedoms, in which case freedoms will not be distributed equally, contrary to Kant’s basic principle of right. Hence a fair and lawful coercion that restricts freedom is consistent with and required by maximal and equal degrees of freedom for all.

Kant holds that republicanism is the ideal form of government. In a republic, voters elect representatives and these representatives decide on particular laws on behalf of the people. (Kant shows that he was not free of the prejudices of his day, and claims, with little argument, that neither women nor the poor should be full citizens with voting rights.) Representatives are duty-bound to choose these laws from the perspective of the “general will” (a term Kant borrows from Rousseau), rather than from the perspective of the interests of any one individual or group within society. Even though the entire population does not vote on each individual law, a law is said to be just only in case an entire population of rational agents could and would consent to the law. In this respect, Kant’s theory of just law is analogous to his universal law formulation of the categorical imperative: both demand that it be possible in principle for everyone to affirm the rule in question (see 5b above).

Among the freedoms that ought to be respected in a just society (republican or otherwise) are the freedom to pursue happiness in any way one chooses (so long as this pursuit does not infringe the rights of others, of course), freedom of religion, and freedom of speech. These last two are especially important to Kant and he associated them with the ongoing enlightenment of humanity in “What is Enlightenment?” He argues that it “would be a crime against human nature” (8:39) to legislate religious doctrine because doing so would be to deny to humans the very free use of reason that makes them human. Similarly, restrictions on what Kant calls the “public use of one’s reason” are contrary to the most basic teleology of the human species, namely, the development of reason. Kant himself had felt the sting of an infringement on these rights when the government of Friedrich Wilhelm II (the successor to Frederick the Great) prohibited Kant from publishing anything further on matters pertaining to religion.

c. Perpetual Peace

Kant elaborates the cosmopolitan theory first proposed in “Idea for a Universal History” in his Towards Perpetual Peace (1795). The basic idea is that world peace can be achieved only when international relations mirror, in certain respects, the relations between individuals in a just society. Just as people cannot be traded as things, so too states cannot be traded as though they were mere property. Just as individuals must respect others’ rights to free self-determination, so too, “no state shall forcibly interfere in the constitution and government of another state” (8:346). And in general, just as individuals need to arrange themselves into just societies, states, considered as individuals themselves, must arrange themselves into a global federation, a “league of nations” (8:354). Of course, until a state of perpetual peace is reached, wars will be inevitable. Even in times of war, however, certain laws must be respected. For instance, it is never permissible for hostilities to become so violent as to undermine the possibility of a future peace treaty.

Kant argued that republicanism is especially conducive to peace, and he argued that perpetual peace would require that all states be republics. This is because the people will only consent to a war if they are willing to bear the economic burdens that war brings, and such a cost will only be worthwhile when there is a truly dire threat. If only the will of the monarch is required to go to war, since the monarch will not have to bear the full burden of the war (the cost will be distributed among the subjects), there is much less disincentive against war.

According to Kant, war is the result of an imbalance or disequilibrium in international relations. Although wars are never desirable, they lead to new conditions in international relations, and sometimes these new conditions are more balanced than the previous ones. When they are more balanced, there is less chance of new war occurring. Overall then, although the progression is messy and violent along the way, the slow march towards perpetual peace is a process in which all the states of the world slowly work towards a condition of balance and equilibrium.

7. Theory of Art and Beauty

Kant’s most worked out presentation of his views on aesthetics appears in Critique of the Power of Judgment (1790), also known as the “Third Critique.” As the title implies, Kant’s aesthetic theory is cashed out through an analysis of the operations of the faculty of judgment. That is, Kant explains what it is for something to be beautiful by explaining what goes into the judgment that something is beautiful. This section explains the structure of aesthetic judgments of the beautiful and the sublime (7a), summarizes Kant’s theory of art and the genius behind art (7b), and finally explains the connection between Kant’s aesthetic theory and his moral theory (7c).

a. The Beautiful and the Sublime

Kant holds that there are three different types of aesthetic judgments: judgments of the agreeable, of the beautiful, and of the sublime. The first is not particularly interesting, because it pertains simply to whatever objects happen to cause us (personally) pleasure or pain. There is nothing universal about such judgments. If one person finds botanical gin pleasant and another does not, there is no disagreement, simply different responses to the stimulus. Judgments of the beautiful and the sublime, however, are more interesting and worth spending some time on.

Let us consider judgments of beauty (which Kant calls “judgments of taste”) first. Kant argues that all judgments of taste involve four components, or “moments.” First, judgments of taste involve a subjective yet disinterested enjoyment. We have an appreciation for the object without desiring it. This distinguishes judgments of taste from both cognitions, which represent objects as they are rather than how they affect us, and desires, which represent objects in terms of what we want. Second, judgments of taste involve universality. When we judge an object to be beautiful, implicit in the judgment is the belief that everyone should judge the object in the same way. Third, judgments of taste involve the form of purposiveness, or “purposeless purposiveness.” Beautiful objects seem to be “for” something, even though there is nothing determinate that they are for. Fourth, judgments of taste involve necessity. When presented with a beautiful object, I take it that I ought to judge it as beautiful. Taken together, the theory is this: when I judge something as beautiful, I enjoy the object without having any desires with respect to it, I believe that everyone should judge the object to be beautiful, I represent some kind of purposiveness in it, but without applying any concepts that would determine its specific purpose, and I also represent myself as being obligated to judge it to be beautiful. Judgments of beauty are thus quite peculiar. On the one hand, when we say an object is beautiful, it is not the same sort of predication as when we say something is green, is a horse, or fits in a breadbox. Yet it is not for that reason a purely subjective, personal judgment, because of the necessity and intersubjective universality involved in such judgments.

A further remark is in order regarding the “form of purposiveness” in judgments of taste. Kant wants to emphasize that no determinate concepts are involved in judgments of taste, but that the “reflective” power of judgment (that is, judgment’s ability to seek to find a suitable concept to fit an object) is nevertheless very active during such judgments. When I encounter an unfamiliar object, my reflective judgment is set in motion and seeks a concept until I figure out what sort of thing the object is. When I encounter a beautiful object, the form of purposiveness in the object also sets my reflecting judgment in motion, but no determinate concept is ever found for the object. Although this might be expected to lead to frustration, Kant instead claims that it provokes a “free play” (5:217) between the imagination and understanding. Kant does not say as much about this “free play” as one would like, but the idea seems to be that since the experience is not constrained by a determinate concept that must be applied to the object, the imagination and understanding are free to give in to a lively interplay of thought and emotion in response to the object. The experience of this free play of the faculties is the part of the aesthetic experience that we take to be enjoyable.

Aside from judgments of taste, there is another important form of aesthetic experience: the experience of the sublime. According to Kant, the experience of the sublime occurs when we face things (whether natural or manmade) that dwarf the imagination and make us feel tiny and insignificant in comparison. When we face something so large that we cannot come up with a concept to adequately capture its magnitude, we experience a feeling akin to vertigo. A good example of this is the “Deep Field” photographs from the Hubble Telescope. We already have trouble comprehending the enormity of the Milky Way, but when we see an image containing thousands of other galaxies of approximately the same size, the mind cannot even hope to comprehend the immensity of what is depicted. Although this sort of experience can be disconcerting, Kant also says that a disinterested pleasure (similar to the pleasure in the beautiful) is experienced when the ideas of reason pertaining to the totality of the cosmos are brought into play. Although the understanding can have no empirical concept of such an indeterminable magnitude, reason has such an idea (in Kant’s technical sense of “idea”; see 2g above), namely, the idea of the world as an indefinitely large totality. This feeling that reason can subsume and capture even the totality of the immeasurable cosmos leads to the peculiar pleasure of the sublime.

b. Theory of Art

Both natural objects and manmade art can be judged to be beautiful. Kant suggests that natural beauties are purest, but works of art are especially interesting because they result from human genius. The following briefly summarizes Kant’s theory of art and genius.

Although art must be manmade and not natural, Kant holds that art is beautiful insofar as it imitates the beauty of nature. Specifically, a beautiful work of art must display the “form of purposiveness” (described above, 7a) that can be encountered in the natural world. What makes great art truly great, though, is that it is the result of genius in the artist. According to Kant, genius is the innate talent possessed by the exceptional, gifted individual that allows that individual to translate an intangible “aesthetic idea” into a tangible work of art. Aesthetic ideas are the counterparts to the ideas of reason (see 2g above): where ideas of reason are concepts for which no sensible intuition is adequate, aesthetic ideas are representations of the imagination for which no concept is adequate (this is in line with Kant’s claim that beauty is not determinately conceptualizable). When a genius is successful at exhibiting an aesthetic idea in a beautiful work of art, the work will provoke the “free play” of the faculties described above (7a).

Kant divides the arts into three groups: the arts of speech (rhetoric and poetry), pictorial arts (sculpture, architecture, and painting), and the art of the play of sensations (music and “the art of colors”) (5:321ff.). These can, of course, be combined. For instance, opera combines music and poetry into song, and combines this with theatre (which Kant considers a form of painting). Kant deems poetry the greatest of the arts because of its ability to stimulate the imagination and understanding and expand the mind through reflection. Music is the most successful if judged in terms of “charm and movement of the mind” (5:328), because it evokes the affect and feeling of human speech, but without being constrained by the determinate concepts of actual words. However, if the question is which art advances culture the most, Kant thinks that painting is better than music.

One consequence of Kant’s theory of art is that the contemporary notion of “conceptual art” is a contradiction in terms: if there is a specific point or message (a determinate concept) that the artist is trying to get across, then the work cannot provoke the indeterminate free play that is necessary for the experience of the beautiful. At best, such works can be interesting or provocative, but not truly beautiful and hence not truly art.

c. Relation to Moral Theory

A final important aspect of Kant’s aesthetic theory is his claim that beauty is a “symbol” of morality (5:351ff.), and aesthetic judgment thereby functions as a sort of “propaedeutic” for moral cognition. This is because certain aspects of judgments of taste (see 7a above) are analogous in important respects to moral judgments. The immediacy and disinterestedness of aesthetic appreciation corresponds to the demand that moral virtue be praised even when it does not lead to tangibly beneficial consequences: it is good in itself. The free play of the faculties involved in appreciation of the beautiful reminds one of the freedom necessary for and presupposed by morality. And the universality and necessity involved in aesthetic judgments correspond to the universality and necessity of the moral law. In short, Kant holds that a cultivated sensitivity to aesthetic pleasures helps prepare the mind for moral cognition. Aesthetic appreciation makes one sensitive to the fact that there are pleasures beyond the merely agreeable just as there are goods beyond the merely instrumental.

8. Pragmatic Anthropology

Together with a course on “physical geography” (a study of the world), Kant taught a class on “pragmatic anthropology” almost every year of his career as a university teacher. Towards the end of his career, Kant allowed his collected lecture notes for his anthropology course to be edited and published as Anthropology from a Pragmatic Point of View (1798). Anthropology, for Kant, is simply the study of human nature. Pragmatic anthropology is useful, practical knowledge that students would need in order to successfully navigate the world and get through life.

The Anthropology is interesting in two very different ways. First, Kant presents detailed discussions of his views on issues related to empirical psychology, moral psychology, and aesthetic taste that fill out and give substance to the highly abstract presentations of his writings in pure theoretical philosophy. For instance, although in the theory of experience from Critique of Pure Reason Kant argues that we need sensory intuitions in order to have empirical cognition of the world, he does not explain in any detail how our specific senses—sight, hearing, touch, taste, smell—contribute to this cognition. The Anthropology fills in a lot of this story. For instance, we learn that sight and hearing are necessary for us to represent objects as public and intersubjectively available. And we learn that touch is necessary for us to represent objects as solid, and hence as substantial. With respect to his moral theory, many of Kant’s ethical writings can give the impression that emotions and sentiments can only work against morality, and that only pure reason can incline one towards the good. In the Anthropology Kant complicates this story, informing us that nature has implanted sentiments of compassion to incline us towards the good, even in the absence of a developed reason. Once reason has been developed, it can promote an “enthusiasm of good resolution” (7:254) through attention to concrete instances of virtuous action, in which case desire can work in cooperation with reason’s moral law, not against it. Kant also supplements his moral theory through pedagogical advice about how to cultivate an inclination towards moral behavior.

The other aspect of the Anthropology (and the student transcripts of his actual lectures) that makes it so interesting is that the wealth and range of examples and discussions gives a much fuller picture of Kant the person than we can get from his more technical writings. The many examples present a picture of a man with wide-ranging opinions on all aspects of the human experience. There are discussions of dreams, humor, boredom, personality-types, facial expressions, pride and greed, gender and race issues, and more. We even get some fashion advice: it is acceptable to wear yellow under a blue coat, but gaudy to wear blue under a yellow coat. There has been a great deal of renewed interest in Kant’s anthropological writings and many commentators have been appealing to these often neglected texts as a helpful resource that provides contextualization of Kant’s more widely studied theoretical output.

9. References and Further Reading

a. Primary Literature

The best scholarly, English translations of Kant’s work are published by Cambridge University Press as the Cambridge Editions of the Works of Immanuel Kant. The following are from that collection and contain some of Kant’s most important and influential writings.

Critique of Pure Reason, trans. Paul Guyer and Allen Wood. Cambridge: Cambridge University Press, 1998.
Practical Philosophy, ed. Mary Gregor. Cambridge: Cambridge University Press, 1996. (Contains most of Kant’s ethical writings, including Groundwork for the Metaphysics of Morals, Critique of Practical Reason, and Metaphysics of Morals.)
Critique of the Power of Judgment, trans. Paul Guyer and Eric Matthews. Cambridge: Cambridge University Press, 2000.
Theoretical Philosophy 1755-1770, ed. David Walford. Cambridge: Cambridge University Press, 2002. (Contains most of Kant’s “pre-critical” writings in theoretical philosophy.)
Theoretical Philosophy after 1781, eds. Henry Allison and Peter Heath. Cambridge: Cambridge University Press, 2002. (Contains Kant’s mature writings in theoretical philosophy, including Prolegomena to Any Future Metaphysics and Metaphysical Foundations of Natural Science.)
History, Anthropology, and Education, eds. Günter Zöller and Robert Louden. Cambridge: Cambridge University Press, 2007. (Contains, among other writings, Anthropology from a Pragmatic Point of View.)

b. Secondary Literature

Ernst Cassirer (Kant’s Life and Thought, tr. by James Haden. New Haven: Yale University Press, 1983 (originally written in 1916)) and Manfred Kuehn (Kant: A Biography. Cambridge: Cambridge University Press, 2002) both offer intellectual biographies that situate the development of Kant’s thought within the context of his life and times.
For comprehensive discussions of the metaphysics and epistemology of Critique of Pure Reason, see Paul Guyer (Kant and the Claims of Knowledge. Cambridge: Cambridge University Press, 1987), Henry Allison (Kant’s Transcendental Idealism: An Interpretation and Defense, Second Edition. New Haven: Yale University Press, 2004), and Graham Bird (The Revolutionary Kant: A Commentary on the Critique of Pure Reason. Chicago: Open Court Press, 2006).
For treatments of Kant’s ethical theory, see Allen Wood (Kant’s Ethical Thought. Cambridge: Cambridge University Press, 1999), Christine Korsgaard (Creating the Kingdom of Ends. Cambridge: Cambridge University Press, 1996), and Onora O’Neill (Constructions of Reason: Explorations of Kant’s Practical Philosophy. Cambridge: Cambridge University Press, 1990).
For analyses of Kant’s aesthetic theory (as well as other issues from the Third Critique), see Rachel Zuckert (Kant on Beauty and Biology: An Interpretation of the ‘Critique of Judgment’. Cambridge: Cambridge University Press, 2010), Paul Guyer (Kant and the Claims of Taste. Cambridge: Cambridge University Press, 1997), and Henry Allison (Kant’s Theory of Taste: A Reading of the Critique of Aesthetic Judgment. Cambridge: Cambridge University Press, 2001).
For studies of Kant’s anthropology and theory of human nature, see Patrick Frierson (What is the Human Being? London: Routledge, 2013) and Alix Cohen (Kant and the Human Sciences: Biology, Anthropology and History. London: Palgrave Macmillan, 2009).

Author Information

Tim Jankowiak
Email: timjankowiak@gmail.com
Towson University
U. S. A.

The Meaning of Life: Early Continental and Analytic Perspectives

The question of the meaning of life is one that interests philosophers and non-philosophers alike. The question itself is notoriously ambiguous and possibly vague. In asking about the meaning of life, one may be asking about the essence of life, about life’s purpose, about whether and how anything matters, or a host of other things.

Not everyone is plagued by questions about life’s meaning, but some are. The circumstances in which one does ask about life’s meaning include those in which: one is well off but bothered by either a sense of dissatisfaction or the prospect of bad things to come; one is young at heart and has a sense of wonder; one is perplexed by the discordant plurality of things and wants to find some unity in all the diversity; or one has lost faith in old values and narratives and wants to know how to live in order to have a meaningful life.

We may read our ancestors in such a way that warrants the claim that the meaning of life has been a human concern from the beginning. But it was only early in the nineteenth century that writers began to write directly about “the meaning of life.” The most significant writers were Schopenhauer, Kierkegaard, Nietzsche, and Tolstoy. Schopenhauer ended up saying that the meaning of life is to deny it; Kierkegaard, that the meaning of life is to obey God passionately; Nietzsche, that the meaning of life is the will to power; and Tolstoy, that the meaning of life lies in a kind of irrational knowledge called “faith.”

In the twentieth century, in the Continental tradition, Heidegger held that the meaning of life is to live authentically or (alternatively) to be a guardian of the earth. Sartre espoused the view that life is meaningless but urged us nonetheless to make a free choice that would give our lives meaning and responsibility. Camus also thought that life is absurd and meaningless. The best way to cope with this fact, he held, is to live life with passion, using everything up, and with an attitude of revolt, defiance, or scorn.

In the Anglo-American tradition, William James held that life is meaningful and worth living because of a spiritual order in which we should believe, or else that it is meaningful when there is a marriage of ideals with pluck, will, and the manly virtues; Bertrand Russell argued that to live a meaningful life one must abandon private and petty interests and instead cultivate an interest in the eternal; Moritz Schlick argued that the meaning of life is to be found in play; and A. J. Ayer asserted that the question of the meaning of life is itself meaningless.

All of these set the table for a veritable feast of philosophical writing on the meaning of life that began in the 1950s with Kurt Baier’s essay “The Meaning of Life,” followed in 1970 by Richard Taylor’s influential essay on the same topic, followed shortly by Thomas Nagel’s important 1971 essay on “The Absurd.” See “Meaning of Life: The Analytic Perspective” for more on the course of the debate in analytic philosophy about the meaning of life.

Background
Nineteenth Century Philosophers
Early Twentieth Century Continental Philosophers
1. Heidegger
2. Sartre
3. Camus
Early Twentieth Century Analytic, American, and English-Language Philosophers
1. James
2. Russell
3. Schlick
4. Tagore
5. Ayer
Conclusion
References and Further Reading

1. Background

a. The Origin of the English Expression “the Meaning of Life”

The English term “meaning” dates back to the fourteenth century C.E. Its origins, according to the Oxford English Dictionary (OED), lie in the Middle English word “meenyng” (also spelled “menaynge,” “meneyng,” and “mennyng”).

In its earliest occurrences, in English original compositions as well as in English translations of earlier works, meaning is most often what, on the one hand, sentences, utterances, and stories, and, on the other hand, dreams, visions, signs, omens, and rituals have or might have. One asks about the meaning of some puzzling utterance, or of the writing on the wall, or of the vision that appeared to somebody in the night, or of the ritual performed on a hallowed occasion. Meaning is often conceived of as something non-obvious and somewhat secretive, discernible only by a seer endowed with special powers.

It is much later that life is spoken of as something that might, or might not, have meaning in this sense. Such speech would have to wait upon the development of the concept of life as something like a word, a linguistic utterance, a narrative, a story, a gesture, a puzzling episode, a sign, a dream, a vision, or a surface phenomenon that points to some deep inner essence, something into whose meaning it would be proper to inquire, or to which it would be proper to apply epithets like “meaningful” or “meaningless.” One of the earliest instances of the occurrence of the concept “life” as such a thing, as signifying something that might or might not have something like meaning, appears in Shakespeare’s Macbeth (c. 1605), where Macbeth characterizes life as “a tale told by an idiot, full of sound and fury, signifying nothing.” But notice that even here the words “meaning” and “life” are not linked.

The OED’s definition of “meaning” in something like our sense is “The significance, purpose, underlying truth, etc., of something.” Further elaboration of early uses of the word gives us, “That which is indicated or expressed by a (supposed) symbol or symbolic action; spec. a message, warning, idea, etc., supposed to be symbolized by a dream, vision, omen, etc.” A bit later, in one of its senses, meaning takes on the sense in which it is the “signification; intention; cause, purpose; motive, justification,” . . . “[o]f an action, condition, etc.” Finally we get the sense that most nearly concerns us here: “Something which gives one a sense of purpose, value, etc., esp. of a metaphysical or spiritual kind; the (perceived) purpose of existence or of a person’s life. Freq. in the meaning of life.” (All this is from the OED.)

The first English use of the expression “the meaning of life” appeared in 1834 in Thomas Carlyle’s (1795-1881) Sartor Resartus II. ix, where Teufelsdröckh observes, “our Life is compassed round with Necessity; yet is the meaning of Life itself no other than Freedom.” The usage shortly caught on, and over the next century and a half the phrase “the meaning of life” became common. The adjective “meaningful” did not appear until 1852, the noun “meaningfulness” until 1904.

b. Questions about the Meaning of Life

The most familiar form of the question(s) about the meaning of life is simply, “What is the meaning of life?” Although the form of the question is one, when it is asked, any one (or more) of several different senses may be intended. Here are some of the more common of them.

(1) In some cases, what the seeker seeks is the kernel, the inner reality, the core, or the essence, underlying some phenomenon. Thus one might ask what his essence, his true self is, and then feel that he has found the meaning of his life if he discovers that true self.

(2) In other cases, the question is about the point, aim, object, purpose, end, or goal of life, typically one’s own. Here, in some cases, the question is about some pre-existing purpose that the questioner might (or might not) discover; in other cases, the question might be about some end or purpose the agent might invent or create and give her life. The latter questioner, when she is successful, may believe that her life has a meaning because she herself has given it one.

(3) In yet other cases, the question of the meaning of life is that of whether our lives, and anything we do within them, matter, or have any sort of importance. If one can show that they matter, and in virtue of what they do, one will have provided a substantive answer to the question of the meaning of life. A common, but not universal, assumption on this score is that our lives have significance and importance only if they issue in some lasting achievement the ravages of time will not destroy.

(4) In still other cases, what bothers the questioner is the discord, plurality, and chaotic nature of his apparent empirical life as it is actually lived. He can make no sense of it; there is no rhyme or reason to it. The drive here, one might well think, is to see one’s life as intelligible, as something that makes sense. The discovery or invention of some kind of unity in his life would amount to an answer to his question, “What is the meaning of life?”

(5) Yet another thing the question about the meaning of life can be is a request for a narrative or picture, a way of seeing life (perhaps a metaphorical one) that enables one to make sense of it and achieve a sense of meaning while living it. And so we get “Life is a bowl of cherries” and various and sundry religious narratives.

(6) Sometimes what the questioner is really wondering is whether it makes sense to go on and his question is “Is life worth living?” He may actually be contemplating suicide. His predicament has to do with meaning if he is assuming that it makes sense to continue living only if (his) life has a suitable meaning, something which, at the moment, he can’t see it as having.

(7) Finally, the question of the meaning of life can be the question of how one should live in order to have a meaningful life, or, if such a life is impossible, then what the best way to live meaninglessly is.

The seven questions just distinguished may be, but need not be, discrete and self-contained. A given seeker may very well be interested in several of them at once and see them as intimately connected. For example, a person may be interested in his core or essence because he thinks that knowledge of that may reveal the goal or purpose of his life, a purpose that makes his life seem important and intelligible, and gives him a reason for going on, as well as insight into how he must live in order to have a meaningful life. It is commonly the case that several of the questions press themselves on the seeker all at the same time.

One or more of these questions were of concern to the philosophers discussed below. Some were concerned with nearly all of them. Distinct from all the above are second-order, analytic, conceptual questions of the sort that dominate current philosophical discussion of the issue in analytic circles. These questions are not so much about the meaning of life as about the meaning of “the meaning of life” and its component concepts (“meaning,” “life”), or related ones (“meaningfulness,” “meaninglessness,” “vanity,” “absurdity,” and so forth).

c. The Broader Historical Background

Although nineteenth century thinkers were the first in the West to put the question precisely in the form “What is the meaning of life?” concern with questions in what may be called “the meaning-of-life family,” that is, ultimate questions about life, the world, existence, and its purpose may be found, in the East and the West alike, almost as far back as we can trace human thought about anything. Thus Gilgamesh (c. 2000 B.C.E.) asked why he must die; the composers of The Rig Veda (c. 1200 B.C.E.) wondered where everything came from; Job (c. 500 B.C.E.) asked why he must suffer; the ancient Taoists (Laozi c. 500 B.C.E. and Zhuangzi c. 300 B.C.E.) asked what the origin or principle of everything is, and how one must live to be in accord with it; ancient Upanishadic seekers (500-300 B.C.E.) were much vexed with the nature of the true self and its end or goal; the Buddha (c. 500 B.C.E.), before he became the Buddha, sought an understanding of life that would enable one to overcome suffering; the author of The Bhagavad Gita (c. 200 B.C.E.) was concerned, as other Indian thinkers tended to be, with the identity and nature of the true self, and also with the question of how to live; the ancient Greeks of the classical period (c. 430-320 B.C.E.) talked about the goal or end of life and how to reach it; Epicurus (341-270 B.C.E.) followed suit and developed his own unique take on these matters; Qoheleth, the author of Ecclesiastes (c. 200 B.C.E.), was struck by the vanity or futility of everything and wondered how to deal with it; Greek and Roman Hellenistic philosophers (c. 300 B.C.E. – 250 C.E.)—Epicurean, Stoic, Cynic, Skeptic, and Neo-Platonist—wondered about the good and how to achieve it; Marcus Aurelius (121-180 C.E.) mused on his cosmic insignificance.

The Christian-dominated medieval period did not produce thinkers who asked in any radical way about the meaning of life, because everyone already had a perfectly good answer, the one provided by the Christian story. Still, even in medieval times, there was room for at least three questions in the meaning-of-life family. First, there was occasion for the questions when things ran counter to the Christian story, or to what one expected. Thus Boethius (480-525) was perplexed by the deep questions when, after a life of honor, piety, and power, he fell into disgrace, had everything stripped from him unaccountably and unjustly, and found himself faced with imprisonment that led eventually to his execution. Second, though the great Christian philosopher-theologians thought they knew the meaning of life in outline, they still asked and answered questions about the details of the final or highest good of man. Thomas Aquinas (1224-1274), for example, who accepted with unblinking assurance the general answer supplied by Christianity, found himself wondering about the exact nature of the summum bonum (the highest good) and about how to square the Christian view of it with that of Aristotle. Third, other Christian believers, medieval ones as well as present-day ones with medieval outlooks, committed to an overall view of what is going on, may be vexed by the question of what God intends for them specifically and may worry about their “calling,” the particular purpose, role, or plan God has especially for them. Hence we find confirmed believers worried deeply about the question, “What is the meaning of my life?”

In any event, since the early modern period, there has been a resurgence of interest in fundamental meaning-of-life questions. Writers as diverse as Shakespeare (1564-1616), Pascal (1623-1662), Dr. Johnson (1709-84), Kant (1724-1804), and Hegel (1770-1831) have asked, in different forms, questions about life’s ultimate point, goal, or purpose, and they are just a few of the many religious, philosophical, and literary figures who have raised and (sometimes) answered ultimate questions in the meaning-of-life family prior to Schopenhauer’s work early in the nineteenth century. There have been philosophers too since Schopenhauer’s time who have addressed the big questions, but not explicitly in terms of “the meaning of life.” This article will confine itself largely to those philosophers who have explicitly put their concerns in those terms.

The standard explanation of the rise of questions about life’s meaning in the early modern period points to three or four distinct but related things: (1) the scientific revolution; (2) the Protestant Reformation; (3) voyages and travels of exploration and discovery, in which were encountered peoples with very different outlooks on the nature of the universe and the meaning of life; and (4), as a result of all of these, the evaporation of a widely held, firmly believed Christian conception of the nature of things.

2. Nineteenth Century Philosophers

Let us turn now to the story of what philosophers from Schopenhauer in the early 1800s to Ayer and Camus in the 1940s have had to say about the meaning of life.

a. Schopenhauer

The first Western philosopher to link the ideas of life and meaning, and to ask expressly “What is the meaning of life?” was the great German pessimist Arthur Schopenhauer (1788-1860). At least he was the first to ask the question and get it noticed by other philosophers. Schopenhauer, a contemporary of Carlyle, wrote in German, in which “the meaning of life” is “der Sinn des Lebens.” Profoundly influencing the thought of both Nietzsche and Tolstoy, Schopenhauer’s work may be regarded as the springboard that launched modern Western philosophical inquiry into the problem of the meaning of life. Here is the passage in which Schopenhauer explicitly asked the question:

Since a man does not alter, and his moral character remains absolutely the same all through his life; since he must play out the part which he has received, without the least deviation from the character; since neither experience, nor philosophy, nor religion can effect any improvement in him, the question arises, What is the meaning of life at all? (1860b) [emphasis added]

The circumstances under which concern with the problem of the meaning of life arose were, in Schopenhauer’s case, not merely academic but real and personal. Well off financially, but struggling with personal misery and a sense of loneliness and isolation, he felt driven to find some understanding of himself and of the world around him that seemed so bleak and senseless.

Schopenhauer’s philosophy begins with a metaphysical structure he inherited from Kant and more or less simply decrees. There is a difference between the thing-in-itself and the phenomenal world of appearances. The thing-in-itself is the will to live, or, more simply, the will. It is the fundamental power and reality that underlies all things. The world we know and live in, with its stupendous abundance of things and forms, is merely the phenomenon of the will, the objectification of it, its mirror, something not entirely real, or not real at all. (There is also a pure, will-less subject of knowledge whose metaphysical status is unclear: sometimes it seems to be in the very realm of the will, the realm of true reality, of things-in-themselves; at other times it seems to be something like the first creation and objectification of the will.)

The will itself just wills. It is pretty nasty, perhaps demonic. It is a blind striving, craving, and grasping, aiming at nothing in the end, except to go on willing and aggrandizing itself. It has in itself an inner contradiction, manifest in the constant struggle and strife between the billions of individual objectifications of itself in the phenomenal world. I am one such objectification; you are another. My true self, my inner essence, is the will; the same is true of you: my essence and yours are one and the same. When we fight (as we usually do), the will is engaged in a battle with itself.

The phenomenal world is an awful place. It is full of misery, pain, suffering. Little happiness is found anywhere. The twin poles of human life are pain (want, desire, stress) and boredom. Almost everyone lives a life that, from without, is meaningless and insignificant and, from within, dull and senseless.

But what is the meaning of life? The question is appropriate because life as we know it is something like Macbeth’s tale told by an idiot, a “farce.” If the question is about life’s inner essence, Schopenhauer’s answer is simply “the will-to-live.” The meaning of life is the will.

Another way of taking the question “What is the meaning of life?” is to construe it as a question about the goal, point, aim, end, or purpose of life. When Schopenhauer explicitly asks the question (in On Human Nature), it is this sense of it he appears to have in mind. His answer is depressing. The point or purpose of life is to suffer. We are being punished for the crime of being born, punished for who we are, namely, the nasty thoroughly egoistic will. The meaning of life in this sense, then, is to suffer, to be punished for our sin.

Schopenhauer suggests a number of ways of thinking about our phenomenal, experienced life. All of them are pretty bleak. He recommends that we look upon our life: as an unprofitable episode interrupting the blessed calm of nothingness; as on the whole a disappointment, nay, a cheat; as Hell, in which on the one hand men are the tormented souls and on the other the tormenting devils; as a place of atonement, a sort of penal colony; as some kind of mistake; and as a process of disillusionment. Any or all of these could be taken as answers to the question “What is the meaning of life?” (or to the question “What is life?”)

If we ask what we should do, how we can give our lives worth and meaning, Schopenhauer does have an answer. “Salvation” lies in the total denial of the will. Knowledge of the will and its horrific phenomena can and should function as a quieter of the will, bringing it to a state in which it stops willing and effectively abolishes itself. Thinking in this vein, a Schopenhauerian might say that the meaning of life is to deny, quiet, and eventually abolish the will to live that is essentially oneself.

One naturally wants to know whether this is not just suicide—whether the cessation of willing simply means that one passes into a state of nothingness. Schopenhauer’s answer is “No.” The state of the will-less individual after death seems to be nothing to us; but our present state would seem to be nothing to him. His state is wonderful and blessed, but what it is like is inconceivable to us.

In our current state, when one denies the will in herself, she does not literally commit suicide. Suicide doesn’t work because it is itself a powerful act of willing. Instead, she practices self-denial and asceticism, cultivates detachment, stops wanting and pursuing the things most people go for; and although there is still some struggle with the dying will in her, on the whole her life becomes full of peace and joy. The will is quieted and eventually abolishes itself in the individual. Very few people are capable of doing this heroic thing, Schopenhauer says, but he himself does not claim to be one of these people.

For all the darkness of his philosophy, the moral for all of us—even those of us who are not prepared to totally deny the will—which Schopenhauer derives in the end is very much in the Christian/Buddhist vein. We should not be competitive or grasping or villainous, but rather we should show compassion and kindness to everyone, since everyone is always having a bad day in this hell we are all living in, and what we all need above all are love, compassion, help, and consideration. The fundamental principle of morality, which you should follow, is: Don’t hurt anyone; help everyone you can. Following this principle, one can achieve, short of complete denial of the will, a kind of half-way salvation.

Another of Schopenhauer’s points about meaning in life should be mentioned. It is that the meaningfulness of one’s life depends not on one’s outer circumstances but rather on the way one looks at life. People look at life differently, and so the meaningfulness of life varies considerably from person to person. To one person life is barren, dull, and superficial; to another rich, interesting, and full of meaning.

b. Kierkegaard

A major nineteenth-century European philosopher who continued the tradition of thought on the meaning of life was the Danish philosopher Søren Kierkegaard (1813-1855). Kierkegaard was not an academic. The sources of his interest in problems of meaning seem to have been his not having to work for a living, his personal demons, his Nordic gloom, his congenital tendencies toward guilt, depression, anxiety, and dread, his awareness of increasing doubt all around him of the teachings of his inherited Christianity, and his agonizing failure to live up to his own Christian ideals, primarily because of his embodiment and its concomitant proclivity for the things of the flesh, especially sensuousness and sex. Out of all that emerged what appears to be a severe case of self-loathing, which in turn prompted serious inquiry into the meaning of (his) life.

It is difficult to determine what Kierkegaard’s own views were on just about everything because he constantly used humor, satire, paradox, and irony, and even more because he spoke in different voices and wrote from different perspectives under different pseudonyms.

Nonetheless, the standard view is that Kierkegaard was fundamentally a Christian. He claimed that one’s life can be meaningful and worth living only if one believes genuinely and passionately in the Christian God.

And then there is the leap. Christian belief goes beyond rational evidence, and even conflicts with it. One must make a leap from knowledge to Christian faith—the only thing in which one can find true meaning—a leap over the confines of common sense and reason. One is to accept Christian faith even if (or just because?) it is absurd. For it is the only adequate source of the kind of meaning a human being has to have to keep on going with a sense that life is worthwhile.

Another way to describe Kierkegaard’s overall philosophy is to characterize it in terms of his three stages or levels of life. One should make an ascent from the lowest stage, the aesthetic (sensuous, even sensual), through the higher ethical stage, and on to the highest stage of all, the religious, which somehow baptizes and incorporates the two lower stages into itself. Only one who has reached the religious stage can have a truly meaningful life and thus a life worth living.

Whatever Kierkegaard’s own view was, we can make the following observations about things Kierkegaard (or one or other of his pseudonymous authors) said about the meaning of life.

(1) One thing is that life can seem meaningless. In the early work, Either/Or (1843), we find this passage: “How empty and meaningless life is.” Elsewhere in Either/Or we get similar thoughts and questions, for instance, “What, if anything, is the meaning of this life?” and “My life is utterly meaningless.” Perhaps, though, the idea is that, though life is often meaningless, it need not be so, and, when it is, it is because of some kind of failure of the liver (of the life, not the organ).

(2) A second interesting idea in Kierkegaard is that meaning has something to do with unity. In a meaningful life all the diverse aspects of it come together to form some kind of coherent whole. One pursues some one goal, to which everything in one’s life is subordinated.

(3) A third point, an important one, is that, though meaning is a good thing, it is possible for there to be too much meaning in one’s life, or in its parts. Kierkegaard observes:

No part of life ought to have so much meaning for a person that he cannot forget it any moment he wants to; on the other hand, every single part of life ought to have so much meaning for a person that he can remember it at any moment. (Either/Or)

To have one’s life full of meaning to the brim, to regard life and everything one does in it as infinitely significant, brings with it so much pressure and stress that one’s life becomes unbearable.

To me [says Kierkegaard] it seems . . . that to be known in time by God makes life enormously strenuous. Everywhere where he is present each half hour is of infinite importance. Yet to live like that for sixty years is unsupportable. It is difficult enough putting up even with the three years’ hard study for an examination, and those are still not as strenuous as half an hour like this. (Concluding Unscientific Postscript)

(4) A fourth idea about meaning in Kierkegaard is the idea that one can give one’s life meaning, or that one can acquire meaning in life, by doing something like devoting oneself to something. Of Antigone he says, “her life acquires meaning for her in its devotion to showing him [her father, after his death] the last honors daily, almost hourly, by her unbroken silence.” (Either/Or)

(5) Meaning does not come from abstract, objective knowledge of any kind, whether philosophical, or scientific, or historical, or even theological. It comes from some kind of faith, a faith that is passionately acquired and lived daily.

(6) One twentieth century approach to the problem of the meaning of life is to see, accept, and bask more or less happily in the absurdity of life. Kierkegaard anticipated this approach prophetically in his characterization of the “humorist.” Kierkegaard writes: “Weary of time and its endless succession, the humorist runs away and finds humorous relief in stating the absurd.” (Concluding Unscientific Postscript)

(7) Kierkegaard’s humorist also at one point expresses a view which is surprisingly rare, namely, the view that one’s life may have a meaning, but one doesn’t know what it is. Kierkegaard writes: “[L]et a humorist say what he has in mind and he will speak, for example, as follows: What is the meaning of life? Yes, good question. How should I know?” (Concluding Unscientific Postscript)

(8) Although Kierkegaard himself was a Christian who viewed meaning as ultimately grounded in religious faith, in one’s personal relation to a supernatural God, yet, paradoxically perhaps, and certainly in an admirable spirit of non-exclusivity, he said:

It is possible both to enjoy life and to give it meaning and substance outside Christianity, just as the most famous poets and artists, the most eminent of thinkers, even men of piety, have lived outside Christianity (Concluding Unscientific Postscript).

(9) One finds in Kierkegaard the idea that life has meaning only insofar as it is related in some way to the Infinite. Nothing finite can supply the meaning of life.

On the whole, if for no other reason, Kierkegaard’s work is valuable because of its suggestiveness. Under one pseudonym or another, Kierkegaard made many important points which were taken up, or unfortunately overlooked, by subsequent philosophers concerned with the meaning of life.

c. Nietzsche

Friedrich Nietzsche (1844-1900) cut his philosophical teeth on Schopenhauer and devoted himself in his later works (from 1883 up to the onset of insanity in January 1889) to struggling with, among other things, the meaning of life.

Nietzsche’s grand project was the revaluation of all values. Part of this project was that of giving to life a new meaning. Nietzsche’s interest in the matter was not merely academic. Coming up with new values and giving life a new meaning was a project that involved a total transformation of Nietzsche’s own self, early versions of which he became dissatisfied with. One thing Nietzsche wanted to do was to produce an affirmative philosophy of life to replace Schopenhauer’s pessimistic, life-denying philosophy.

Nietzsche rejected Schopenhauer’s picture of life as suffering, or punishment for one’s sin, together with its ethic of compassion toward the poor and the sick. Such a picture belonged to a weak, sick, decadent, nay-saying mode of being in decline. Nietzsche himself wanted to produce a positive, healthy, life-affirming philosophy, one suitable for life in the ascendant.

Sometimes, particularly early in his writings, Nietzsche seemed to think some end or other is required to make things meaningful. At times, both early and late, Nietzsche spoke as though the very concept of the meaning of something is the concept of its end, object, or goal.

In other places, however, Nietzsche spoke as if the meaning of life lies in freedom from, not in the achievement of, ends. Perhaps this should be construed as the rejection of given ends to be discovered, not as the rejection of all ends, particularly those one creates. Moritz Schlick, whose thought we will consider in more detail later, claimed that Nietzsche saw that life has no meaning so long as it stands wholly under the domination of purposes. In Nietzsche’s Zarathustra, “Sir Hazard,” expressing Nietzsche’s own considered view, says, “I have saved them from the slavery of ends.” (Klemke, 3rd ed., 63).

Nietzsche sometimes spoke as if life, before he came into it, or before he revaluated all values, had no meaning: “Sombre is human life, and as yet without meaning: a buffoon may be fateful to it” (Thus Spake Zarathustra, 1883). There is no meaning “out there” to be discovered, no meaning in the essences of things, apart from human will, desire, perspective. In fact, apart from perspective, there is no world out there at all, no “thing-in-itself,” no “facts-in-themselves.” But a psychologically strong person can do without things in themselves and meaning (already there) to be discovered in them. That is because he can organize a small part of the world himself and thus create meaning. In The Will to Power, Nietzsche speaks of “the creative strength to create meaning,” and he says:

It is a measure of the degree of strength of will to what extent one can do without meaning in things, to what extent one can endure to live in a meaningless world because one organizes a small portion of it oneself. (The Will to Power)

Whatever the meaning of life is, or is to be, it is terrestrial, not celestial. Meaning must not be placed in some fabricated “true world” but in this very earth in which we live and have our being. And the meaning of life is to be created, not discovered.

Still, somehow, man is not the meaning and measure of all things, though he has posited himself as such.

All the values by means of which we have tried so far to render the world estimable for ourselves and which then proved inapplicable and therefore devaluated the world—all these values are, psychologically considered, the results of certain perspectives of utility, designed to maintain and increase human constructs of domination—and they have been falsely projected into the essence of things. What we find here is still the hyperbolic naiveté of man: positing himself as the meaning and measure of the value of things. (The Will to Power)

The mistake lies in projecting our own values onto reality, in thinking that our meaning and values are present in things as such. But our meaning does not lie in “things-in-themselves.” It is created by us. If we then give things out there such and such a meaning, we should recognize that it is not a meaning we have found in the things themselves, but rather one that we have given them.

We can still ask, What is the meaning of life? What is the meaning we shall give to life? Nietzsche gives two different answers. One is that the meaning of life is the Übermensch (sometimes translated as ‘Superman’), Nietzsche’s post-human creator of meaning, affirmer of life, and bearer of values.

I want to teach men the sense of their existence, which is the Superman, the lightning out of the dark cloud—man. (Thus Spake Zarathustra)

The Superman is the meaning of the earth. Let your will say: The Superman SHALL BE the meaning of the earth! (Thus Spake Zarathustra)

The other answer is that the meaning of life is the will to power.

All meaning is will to power. (The Will to Power)

On the surface these two answers are different. But perhaps they are consistent. Perhaps what the will to power generates is the Superman, or what the Superman represents is the will to power. Again, perhaps the will to power is the meaning of life in the sense of its kernel or essence, while the Superman is its meaning in the sense of its end or goal.

Nietzsche’s view has some aspects or consequences that should be noted. One consequence of Nietzsche’s view is that the meaning of life is absent in the old and the sick. He acknowledged the fact. Another consequence (or perhaps component) of Nietzsche’s view is that nihilism, the denial of all value, is a transitional stage, not the finale. Yet another consequence is that the meaning of life is not about the predominance of pleasure over pain. Concern with that evidences only nihilism. Finally, it may be conjectured that Nietzsche would probably regard with scorn those of us in the current debate among academic philosophers about the meaning of life. He would consider us “minute” philosophers:

The study of the minute philosophers is only interesting for the recognition that they have reached those stages in the great edifice of philosophy where learned disquisitions for and against, where hair-splitting objections and counter-objections are the rule: and for that reason they evade the demand of every great philosophy to speak sub specie aeternitatis. (Nietzsche, 1874)

d. Tolstoy

One of the next thinkers in the Western intellectual tradition to ask seriously the question, “What is the meaning of life?” was the great Russian novelist and moralist Count Leo Tolstoy (1828-1910). He asked the question and offered part of an answer in A Confession, written in Russian in 1879, circulated in 1882, and translated and published in 1884. Tolstoy’s reflections on the question stimulated a great deal of subsequent debate on the issue.

Although characters in his earlier works, such as War and Peace, sometimes talked about the meaning of life and felt the problem deeply, Tolstoy himself raised serious questions about it only as part of a psychological crisis he underwent in the mid to late 1870s. Despite having everything anyone could ever want—wealth, fame, status, love, physical strength, and so forth—Tolstoy found himself severely disturbed. His symptoms were depression, psychological paralysis, obsession with suicide, and the continual recurrence in his head of the question of the meaning of life.

Tolstoy put his question about the meaning of life in several different ways. Here are some of them, listed in order of their occurrence in his Confession:

What is it for? What does it lead to? Why? What then? What for? But what does it matter to me? What of it? Why go on making any effort? How go on living? What will come of what I am doing today or shall do tomorrow? What will come of my whole life? Why should I live, why wish for anything, or do anything? Is there any meaning in my life that the inevitable death awaiting me does not destroy? What am I, with my desires? Why do I live? What must I do? What is the meaning of my life? Why do I exist?

Several of these seem to be quite different questions, but Tolstoy regarded them all as the same question put in different ways.

Tolstoy said explicitly that his question was not about the composition, origin, and fate of the universe, nor again about the question, “What is the life of the whole?” That question, Tolstoy said, is unanswerable for a single man, and it is “stupidity” to think an individual must first answer the question about the meaning of the universe or the whole of humanity before he can answer the question of the meaning of his own life.

Tolstoy came to think that he should not expect to find the answers to his questions in philosophy. The legitimate task of philosophy is merely to ask the question and perhaps refine and clarify it, not to answer it, which it cannot do.

This view of philosophy as incapable of providing answers to the questions of life must have been one Tolstoy came to some way into his crisis. At another point, apparently earlier, Tolstoy did try to find answers in philosophy (as well as in the mathematical, physical, biological, and social sciences). The philosophers he studied were Socrates, the Buddha, “Solomon” (the author of Ecclesiastes), and Schopenhauer.

All of these he interpreted as providing a negative answer. The gist of Socrates’ thought is that the true philosopher seeks death, because the life of the body, with all its ailments and desires, is an impediment to what he is really all about, namely, the quest for truth. The individual life of the physically discrete individual is pretty meaningless, something one would rather do without. The Buddha, as Tolstoy read him, teaches that life is the greatest of evils and works as hard as he can to free himself from it. “Solomon” teaches that it’s all “vanity.” And Schopenhauer, as Tolstoy understood him, wishes for, and advocates, annihilation.

In a nutshell, Tolstoy’s problem was this: since I will suffer, die, be forgotten, and make no difference (leave no trace) in the long run, how does my life, or anything I do, have any meaning? It was a problem he felt deeply. He had to have an answer to go on living. Tolstoy’s concern with the issue was not merely theoretical.

The solution to the problem that Tolstoy eventually came to was one he thought had been known all along by the unlearned peasants. The solution lies in a kind of irrational knowledge called “faith.” Faith is faith in God, and lived faith involves some kind of relation to the Infinite. Meaning is found in the appropriate relationship to God, the Infinite. Tolstoy’s solution bears obvious resemblances to Kierkegaard’s and is very much in the same spirit.

Tolstoy spent the rest of his life working out the details of, or variations on, this solution. The progress of his thought can be traced in What I Believe and On Life, as well as in his late short fiction (The Death of Ivan Ilych, Father Sergius, and so forth). To the end Tolstoy held that faith in God, work, service to others, unselfishness, and love are essential parts of a meaningful life. He taught that the things ordinarily pursued by many—wealth, status, power, fame—contribute nothing to the meaningfulness of life.

e. Some Common Aspects of the Lives of Schopenhauer, Kierkegaard, Nietzsche, and Tolstoy

Schopenhauer, Kierkegaard, Nietzsche, and Tolstoy all had lives which rendered them virtual breeding grounds for problems with the meaning of life. (1) All of them were well off and did not have to work for a living; there is no evidence that any of them ever felt a real threat of, say, homelessness or starvation. Nietzsche was the one that wasn’t exactly wealthy, but in his case his early retirement (in his mid-thirties) provided him with a pension for life sufficient to meet his material needs and free him up for a life of thought and writing. (2) All of them suffered from psychological illness of one sort or another: at the very least, a sense of gloom or melancholy, and in some cases a sense of worthlessness and a preoccupation with suicide, or feelings of dread and anxiety, or the encroachment of outright madness. (3) All of them grew up in religious environments, the tenets of which they lost faith in when they reached adulthood, and the lack of which they struggled with throughout their lives (eventually regaining, in the cases of Kierkegaard and Tolstoy, some portion of what they had lost). (4) None of them was a professional academician, except for Nietzsche in his youth.

From these four, and from our own experiences of life, we have inherited, to the extent that we have it, our preoccupation with the meaning of life.

3. Early Twentieth Century Continental Philosophers

In the early twentieth century questions about the meaning of life continued to be of interest to leading European or “Continental” philosophers.

a. Heidegger

The great German philosophy professor Martin Heidegger (1889-1976) was certainly concerned with the meaning of life. He presented two different outlooks, which we may call “early Heidegger” and “later Heidegger.”

For early Heidegger (that is, the Heidegger of Being and Time, 1927), the question of the meaning of life is the question how we can live an “authentic” life, one that is our life, not just the life for us that has been fixed by the community we live in. His answer is that to live a meaningful life is to live a life of authenticity. To live a life of authenticity is to live a life that one oneself chooses, not the life that is prescribed for one by one’s social situation. To live a life of authenticity, one must have a plan, something that unifies one’s life into an organic whole. This is one’s own plan. So a meaningful life is one of focused authenticity. “Authenticity is Heidegger’s account of what it is to live a meaningful life.”

Living authentically, it turns out, is a matter of living in a way that is true to your heritage. “Being true to heritage is being true to your own, deepest self.” In the end, the content of authenticity is not something you freely choose ex nihilo, but rather something you discover in the conjunction of heritage and facticity.

Early Heidegger’s thought seems to be a kind of pantheism, and it is possible that Heidegger subscribed to some such view all his life.

Later Heidegger proposes a somewhat different view. In this philosophy of his, we are given the task, in which our meaning lies, of being “guardians of the world.” The world is a holy place. To understand and appreciate that fact is not just to exhibit a certain intellectual and practical stance toward the world, but to live with an attitude of respect and reverence toward the world, toward the natural world especially. Later Heidegger saw exploitation of the natural world, as in mining and highway-building, as deplorable, as contrary to the very meaning of life. The meaning of life is guardianship of the world.

b. Sartre

The French philosopher Jean-Paul Sartre (1905-1980) changed his views over the course of his life. In his work Being and Nothingness (1943), he advocated an outlook from which life is absurd. We more or less seriously pursue goals which, from a detached standpoint, we can see don’t really matter. But we continue to act as though they do, and hence our lives are absurd. The Sartrean project is to overcome this detached standpoint, or to incorporate it into our lives.

The problem is other people. They insist on their own reality. They tend to get in the way of our pursuit of our own goals.

Later on, Sartre espoused a somewhat different view. On this new view, “our fundamental goal in life is to overcome our ‘contingency’,” to become the foundation of our own being. The main obstacle (again) is other people who, on the one hand, pursue their own (different) goals and, on the other, pose a real (military) threat to one’s way of life and one’s homeland.

In his 1944 play, No Exit, there is the famous line: “Hell is other people.” Other people do not cooperate with my projects, and I do not cooperate with theirs. The result is war, in something like Schopenhauer’s sense. People are always at war, or at least at odds, with each other.

In both his early and his later thought, Sartre ends up being pretty pessimistic and depressing. Life is meaningless. We can, by our free choice, give life some meaning or other. But the decision to do so is itself a matter of ungrounded free choice, which is such that it doesn’t matter whether that decision or some other one is made.

c. Camus

Albert Camus (1913-1960), a Frenchman born in Algeria, was one of the leading existentialists (though he himself disowned the label) and one of the more influential writers of the first half of the twentieth century. He was familiar with the work of Nietzsche, and greatly influenced by it.

On our theme, Camus’s starting point was the perception of the absurd. Human life, he felt, was absurd, meaningless, and senseless. The way in which it is, or the reason it is, lies in an inevitable clash between the needs and aspirations of human beings and the cold, meaningless world.

This clash has at least four facets. First, we seek—demand, even—a rational understanding of things, some way of seeing the world as familiar to us. But the world does not cooperate: to us, it is ultimately unintelligible. Second, we long for some kind of unity underlying and organizing the manifest diversity we find all around us. But again, the world is heedless of our longings. The world that presents itself to our senses is nothing but disjointed plurality. Third, we long for a higher reality (a God, for example), something transcendent, some cosmic meaning of everything. But no such meaning can be discerned. Fourth, we strive for continued life, or at least to achieve something permanent in the end. But our efforts are pointless, everything will come to nothing, and all that lies ahead is death and oblivion.

Our situation is like that of the mythical Greek of old, Sisyphus. We are condemned, as it were, to pushing a rock up a hill, over and over only to see it roll back down again, every time, when it reaches the top. Pointless labor is Sisyphus’ lot, and ours too.

The pointlessness and absurdity of life raise the question of suicide. Should we kill ourselves? Camus’s answer is that, no, we should not. Suicide is escapist. To kill yourself is to give in, to lose. If we were prisoners of war—which is something like what we are—our captor and tormentor would want us to do exactly that—confess that things are too much for us and kill ourselves. That would be his ultimate victory, which would bring him a chuckle, or perhaps even a hearty guffaw.

How then should we live? The first thing to do is to insist that life is better if there is no meaning. That would really irritate our tormentor. Second, we should cultivate a mindset of honesty and lucidity. We should not indulge in denial, or evasion, or imaginings of an eventual escape into an afterlife where everything will be put right. We should acknowledge that life is awful, but then, perhaps, add “and I love it” or “all is well.” Third, we should take up an attitude of revolt, defiance, and scorn. Camus observes, “There is no fate that cannot be surmounted by scorn.” Surely such an attitude would vex our hypothetical tormentor beyond measure. Fourth, we should live for now, stop worrying about the future, stop striving to achieve future goals. Nothing is going to come of anything we do in the long run anyway. Fifth, we should “use everything up”: work hard, play hard, approach everything with zest and passion, expend energy to the human limit. This amounts to a kind of perverse “Yes!” to life. Finally, we may ask: why would anyone want to live like this? Is it something that would appeal only to the French? What are the advantages of such an attitude toward life?

Camus has answers to these queries, three in fact. First, living as he recommends is a way of salvaging our dignity, and it is a way to which a certain majesty adheres. Second, surprisingly perhaps, such a way of living brings with it a “curious joy.” Third, it is the way of freedom. Camus’s scornful existentialism is the best conception we have of a truly free human being, one who does not allow himself to be shaped and determined by the mindless, meaningless world that surrounds him.

4. Early Twentieth Century Analytic, American, and English-Language Philosophers

Anglo-American philosophers in the very late nineteenth and early twentieth centuries continued to be interested in problems of the meaning of life as well.

a. James

The American pragmatist philosopher William James (1842-1910), a Harvard professor, wrote a couple of interesting essays on our theme in the late 1890s. Both essays were written as addresses to be delivered to live audiences. They demand some discussion and consideration.

In “Is Life Worth Living?” (1895), James reveals deep, probably first-person, familiarity with the existential source of concern with the issues of the meaning and worthwhileness of life. He calls it the “profounder bass-note of life” and suggests that it is to be found, or heard, somewhere in all of us: “In the deepest heart of all of us there is a corner in which the ultimate mystery of things works sadly.” (1895: 32)

Some people are so naturally optimistic and in love with life that they are constitutionally incapable of being much bothered by the bass-note and pay it little attention. James’s example of such a person is Walt Whitman; and one thinks of the English. James finds no fault—intellectual, moral, or otherwise—with such people. It is rare good fortune to be blessed with such a temperament. If everyone were, the question of the worthwhileness of life would never arise.

But for every Whitman, there is a suicide, and a thinker of the dreary constitution of the poet James Thomson, author of “The City of Dreadful Night.”

In his address, James imagines himself in discussion with a would-be suicide whom he tries to persuade to take up his burden and see life through to its natural end. James acknowledges that some of these suicides—perhaps the majority of them—are too far gone to have anything said to them, for instance, those whose suicidal impulses are due to insanity or sudden fits of frenzy. It is to the class of reflective would-be suicides—those disposed to kill themselves because of their thinking, reading, and brooding on the darker side of life—that James directs his remarks. It is these he wants to cheer up (or comfort) and keep alive.

James speaks of two stages of recovery from suicidal illness. The first stage includes three elements, three palliatives, for the suicidal condition. First, there is the thought, “You can end it whenever you will.” This strikes one as a strange thought to recommend to one contemplating suicide. But James thinks the thought can be a comfort. It means there’s no particular guilt or stigma attached to suicide. It means one won’t have to put up with this miserable world forever; one can opt out whenever one wants. It may delay the act by encouraging the thought, “Why kill myself today when I can always do it tomorrow?” Second, James points out, there is in human beings a natural sense of curiosity. It is worth hanging around a while longer in order to see the headlines of tomorrow’s newspaper. Third, there is a certain fighting instinct in human beings. James thinks the normal man has a reason to go on, even if the whole thing is worthless and meaningless, as long as there is some injustice to be put right, some villain to be put down, or some evil to overcome in the little corner of the universe he inhabits. The three things just mentioned all lie in the first stage of recovery, one that is partial and inferior to what lies in the second stage.

The second stage is one of full recovery. It is the religious stage. It gives one assurance of a fully worthwhile and meaningful life.

James’s injunction is to believe—to believe in a supernatural, spiritual order of things which overcomes and makes right the deficiencies of the natural order as we know it. We do not have rational or evidential proof that such a supernatural order exists. But Kant proved that natural science cannot prove that such an order does not exist. To make one’s life worthwhile and meaningful, all one has to do is to posit faith in such an order, to believe that there is a spiritual realm in which all the wrongs of the natural order are righted. In that case, one will view the natural order as an inadequate representation of the spiritual, or as a veil through which the true and wonderful nature of the spiritual is hidden or obscured.

One need have little conception of what the spiritual realm is like. The content of the belief in it can be quite minimal. All one needs to affirm is that there is such a realm and that its reality makes life worthwhile. James draws on two of the tenets of his pragmatism to support such an approach to the meaning and worthwhileness of life. One is the right to believe what we need to believe, even though it goes beyond belief warranted by empirical and rational evidence. His classic case for the right of such belief is in his essay, “The Will to Believe.”

Another tenet of pragmatism on which James draws is the idea that belief is a matter of action. To believe something is not so much to have a certain mental state as to act in a certain way. Whatever is in one’s mind, to act as though life is worthwhile and has meaning is to believe that it does.

In “What Makes a Life Significant” (1899), James expressly addressed the question of the significance or meaning of life. What he said in this essay was rather different from what he had said in the previous one. The essay was in part a response to the deification of the uneducated, hard-working peasants in Tolstoy’s Confession. James admired Tolstoy a great deal but felt he went a bit overboard in his praise of peasant life and in his tendency to identify it as the very locus of meaning. James held that the lives of Tolstoy’s peasants were full of one ingredient necessary for a meaningful life—toil, struggle, pluck, will, suffering, manly virtues—but that they lacked the other necessary ingredient for a fully meaningful life, namely, what James called “ideals.”

Toward the end of the essay, James gives his own view. He states it in two or three different ways, the sense of which seems to be the same. “[I]deal visions” must be backed “with what the laborers have, the sterner stuff of manly virtue.”

[T]o redeem life from insignificance, [c]ulture and refinement all alone are not enough. . . . Ideal aspirations are not enough, when uncombined with pluck and will. . . . There must be some sort of fusion, some chemical combination among these principles, for a life objectively and thoroughly significant to result. (1899: 877)

The solid meaning of life is always the same eternal thing,—the marriage, namely, of some unhabitual ideal, however special, with some fidelity, courage, and endurance; with some man’s or woman’s pains.—And, whatever or wherever life may be, there will always be the chance for that marriage to take place. (1899: 878)

James is rather vague about what the “ideals” are, or even what they are like. In at least some cases they have something to do with culture and refinement, but it seems that they can and will vary from person to person, and may reside in some form in the uncultured and unrefined. In any event, it is noteworthy that James does not bring up the subject of religion. There is no suggestion that belief in God or a spiritual world is necessary for a fully meaningful life. An ideal wedded to manly virtue is enough.

b. Russell

The British philosopher Bertrand Russell (1872-1970) is often portrayed as one of those early twentieth century analytic philosophers who had no patience for big questions, such as that of the meaning of life. The portrayal is often reinforced by the famous story of Russell and the cab-driver, to whom Russell had nothing to say about the meaning of life.

It is true that Russell sometimes expressed a dismissive attitude toward the question: to Hugh Moorhead he said, “Unless you assume a God, the question (of life’s meaning) is meaningless” (Metz 2013b: 23), and to the taxi-driver he had indeed nothing to say about the meaning of life. But elsewhere he seems to have taken the question very seriously.

In “A Free Man’s Worship,” he begins with a fairly gloomy, despairing picture of the world science reveals to us, the only world there is, really. It is purposeless, void of meaning. The causes that produced us had no prevision of the end they were achieving. We ourselves, and everything precious to us, are the outcome of the accidental collocations of atoms. There is no life for the individual beyond the grave. The existence of our very species, along with all its achievements, will eventually be extinguished in the death of the solar system and “buried beneath the debris of a universe in ruins.”

But the thing for us to do is to maintain our ideals against the hostile universe. That universe knows the value of raw power, and not much else. Let us not worship it, as did Nietzsche. In exalting the will to power, Nietzsche was failing to maintain the highest human ideals in the face of the cruel world; he was, in a sense, giving in, capitulating, prostrately submitting to evil, sacrificing his best to Moloch.

Let us be clear-sighted and honest. Let us recognize that the facts are often bad, that in the world we know there are many things that would have been better otherwise, that our ideals are not in fact realized in the world.

But, again, in our minds and hearts, even though the whole business may be futile, let us tenaciously cling to our ideals, loving truth and beauty. Let us renounce power. Let us worship only the God created by our own love of the good. Let us live constantly in the vision of the good.

One trap we must guard against falling into is that which (Russell would think) Camus fell into some decades later. We should not cultivate and live in a spirit of fiery revolt, of fierce hatred of the senseless universe. Why not? Because indignation is still a kind of bondage, for it compels our thoughts to be occupied with the evil world. Give up the indignation so that your thoughts can be free. From freedom of thought comes art, philosophy, and the vision of beauty.

To achieve this we must develop a kind of detachment from our own personal happiness, must learn to free ourselves from the burden of concern for petty things and personal goods.

To abandon the struggle for private happiness, to expel all eagerness of temporary desire, to burn with passion for eternal things–this is emancipation, and this is the free man’s worship. (Russell 1903: 61)

In The Conquest of Happiness Russell makes a couple of remarks about the meaning of life that are worthy of note. The first is this:

The habit of looking to the future and thinking that the whole meaning of the present lies in what it will bring forth is a pernicious one. There can be no value in the whole unless there is value in the parts. Life is not to be conceived on the analogy of a melodrama in which the hero and heroine go through incredible misfortunes for which they are compensated by a happy ending. (1930: 29)

The second is odd but interesting, perhaps not the kind of thought that would occur to most people:

the human heart as modern civilisation has made it is more prone to hatred than to friendship. And it is prone to hatred because it is dissatisfied, because it feels deeply, perhaps even unconsciously, that it has somehow missed the meaning of life, that perhaps others, but not we ourselves, have secured the good things which nature offers man’s enjoyment. (1930: 75)

The thought seems to be that people hate each other because they think others have achieved (or know?) the meaning of life and they have not. If that is true, one should be careful not to let on that one knows the meaning of life, even if one does.

Several writers have advocated focus and have thought of a life organized by one big project or goal as the paradigm case of a meaningful one. Russell rejects the idea.

All our affections are at the mercy of death, which may strike down those whom we love at any moment. It is therefore necessary that our lives should not have that narrow intensity which puts the whole meaning and purpose of our life at the mercy of accident. For all these reasons the man who pursues happiness wisely will aim at the possession of a number of subsidiary interests in addition to those central ones upon which his life is built. (1930: 177)

Finally, in “The Place of Science in a Liberal Education,” Russell makes the now familiar point that the meaning of life must come not from without but from within.

The search for an outside meaning that can compel an inner response must always be disappointed: all “meaning” must be at bottom related to our primary desires, and when they are extinct no miracle can restore to the world the value which they reflected upon it. (Mysticism and Logic, ch. 2, “The Place of Science in a Liberal Education”)

That is not to say that the meaning of life is created or chosen as opposed to discovered. For our primary desires are something largely given, something (if we are lucky) we simply find in ourselves.

c. Schlick

Moritz Schlick (1882-1936) was one of the central figures of the logical positivist movement. Thinkers in the movement are commonly said to have been dismissive of such “metaphysical” questions as that of the meaning of life. Yet Schlick for one was in no way dismissive. He described himself as a seeker of the meaning of life and wrote an extremely interesting essay on the topic in 1927.

Schlick’s contribution to the debate is (to some) one of the most appealing writings in the whole of the literature. Schlick was aware of Schopenhauer’s musings and was concerned to escape his dire conclusions. Schlick found his answer in (his interpretation of) Nietzsche’s Thus Spake Zarathustra. The answer is that life can be meaningful only if it is freed from its subjugation to ends and purposes. The suggestion is radical: a life has meaning only if it does not have some end or purpose to which everything is subordinated.

Schlick argued that the meaning of life is to be found not in work but in play. Work, in the philosophical sense, is always something done not for its own sake but for the sake of something else, some end or purpose that is to be achieved. Most often that end is the survival and perpetuation of life—that is, more work functioning only to perpetuate the life of the species. But it is absurd to take the meaning of life to lie in the continued survival of the species, or in the work required to make that survival possible. The meaning of life must lie in the content of existence, not in bare existence as such.

What then is the meaning of life? One candidate that suggests itself is feelings of pleasure and happiness. But Schlick rejects that candidate, partly on the grounds that pleasure is likely only to lead to the satiety and boredom which Schopenhauer so vividly made us aware of. Schlick also rejects the ideal of happiness as the meaning of life by way of the observation that man is essentially an active creature for which a life of idle pleasure is by no means suitable. What Schlick ends up saying is that the meaning of life is to be found in play, that is, in activity engaged in for its own glorious sake and not for the furtherance of some further end or goal. Doing something only in order to produce some further end or goal is work, and work cannot be the meaning of life. Of course, work is necessary for human existence and thriving, but it is meaningful only if it can—and it can be—turned into play, something one would do with delight even if nothing came of it in the end.

Schlick backs off from saying that the meaning of life is play. Instead, he says that the meaning of life is youth, since youth is the period of life in which play predominates. A nice consequence of this position is the fact that a life cut short in its infancy or youth is a meaningful life. If you are killed when you are ten years old, it is likely that you lived a life full of meaning.

One other aspect of Schlick’s view should be mentioned. It is that youth is not literally a matter of how long one has lived on this earth. If an old fellow turns his work into play, if he performs it primarily for the sake of the sheer joy of doing it, then he is young in the sense that matters. The key to a fully meaningful life would be to stay forever young.

d. Tagore

The Bengali Indian poet, short-story writer, novelist, dramatist, artist, sage, and philosopher Rabindranath Tagore (1861-1941), often credited with a major role in the cross-fertilization of East and West, won the Nobel Prize in literature in 1913. He wrote in English (sometimes). He knew the works of Einstein, Yeats, Wordsworth, and a host of other Western thinkers. In 1930 he delivered the Hibbert Lectures at Oxford, published the next year as The Religion of Man (1931), a remarkable volume containing much reflection on the meaning of life. This article will limit itself to consideration of a couple of points in that book.

Tagore is interesting because his interest in the question of the meaning of life did not arise out of anything like the circumstances which seemed to create the interest in so many Western thinkers. Tagore was not well-off and bored, he did not suffer from depression and existential angst, he did not worry about the importance of his personal life in the vast scheme of things, he was not a professional academic philosopher.

Tagore’s tendency was to view the question of the meaning of life as the question, “What is man?” or “What am I?” His answer seems to have been that the true human is the universal self, or the true Man represented by the life of the species, or even by the life of all beings.

If he had a problem, it lay in the chaotic, hodgepodge nature of this everyday life. Although he was not exactly seeking a solution to the predicament, one came to him on an ordinary day on which he was just living his everyday life in east India. He gives a gripping and poetic account of it in chapter six of The Religion of Man. He writes:

Suddenly I became conscious of a stirring of soul within me. My world of experience in a moment seemed to become lighted, and the facts that were detached and dim found a great unity of meaning. The feeling which I had was like that which a man, groping through a fog without knowing his destination, might feel when he suddenly discovers that he stands before his own house. (Tagore 1931, 95)

One thing that is noteworthy in this is that Tagore felt he had seen the meaning of life, not when he realized that his life really mattered, or added up to something sub specie aeternitatis, nor when he came up with a view of things that rid him of his angst and depression, but rather when he found that his life was part of a great unity of meaning. He saw meaning when everything, including his individual life, was one unified whole.

A second feature of Tagore’s conception of the meaning of life is the role he gives to detachment. The detachment that is relevant seems to be something like non-attachment to the petty concerns of one’s own individual life. It is not a lack of concern for anything and everything. It is lack of concern for how one’s own individual, personal life fares. The appropriately detached person places his interest in how Man as the eternal being, or beings of any sort, ultimately fare. (There is an admirable concern for all life, not just human life, in the thought of Tagore.) The appropriately detached man loses concern for his personal triumphs and failures and cultivates an enlivening interest in the life of the whole, with which, instead of his personal life, he identifies himself. The result is a vast increase in the sense of meaningfulness in his own life.

e. Ayer

A very different approach to the problem of the meaning of life was taken by the prominent logical positivist English philosopher A. J. Ayer (1910-1989).

Ayer argued, in an important 1947 paper, that “there is no sense in asking what is the ultimate purpose of our existence, or what is the real meaning of life” (Ayer 1947: 201). His argument is that there is no reason to believe in anything like a God who created us and intended us for a specific purpose. And even if there were such a God, his purposes could not give life meaning unless we agreed with them and accepted them. Thus the meaning of life always comes back to what we as individuals purpose, value, and aim at. There is no meaning out there to be discovered.

Ayer insists that the meaninglessness of life is nothing to cry about. One’s life has whatever meaning one gives it. It just doesn’t make sense to ask about the meaning of life because there is not, and could not be, such a thing. The question “What is the meaning of life?” is illogical and unanswerable. But a person can give his life a meaning, and if he does, it will be meaningful to him. It will come down to the value judgments the person makes. And these are a matter of personal choice and preference. There is no sense in saying that one person’s value judgments are true and another’s false. Give your life a meaning, and that’s the meaning it will have.

5. Conclusion

The dismissal of the question about the meaning of life which was characteristic of Ayer and his generation, and Camus’s idea that meaninglessness doesn’t matter, may be what ironically sparked the recent interest in the question. The natural philosophical response is that surely the question of the meaning of life is meaningful and important: in light of the remarks of Ayer, Camus, and their ilk, how is that so? A sense that the meaning of life must be a philosophical problem that matters has motivated work on the question of what the question of the meaning of life is all about, if we do not take Ayer’s dismissive attitude and Camus’s stance toward it. The works of Richard Taylor, Robert Nozick, Thomas Nagel, Joel Feinberg, Harry Frankfurt, Susan Wolf, Thaddeus Metz, Joshua Seachris, Julian Young, John Cottingham, David Benatar, and Garrett Thomson (among others) are attempts to answer this question.

The preceding survey brings us up to around 1950, just before a veritable explosion of works on the meaning of life took place in philosophy, especially in the Anglo-analytic tradition. Those interested in this explosion should begin by consulting the excellent overviews in Thaddeus Metz’s article in the Stanford Encyclopedia of Philosophy (Metz 2013) and Joshua Seachris’s article in The Internet Encyclopedia of Philosophy (Seachris 2012).

6. References and Further Reading

Ayer, A. J. “The Claims of Philosophy.” Reprinted in The Meaning of Life, 3rd Ed. E. D. Klemke (ed.). New York: Oxford University Press, 2008: 199-202. (Originally published in 1947.)
Baier, K. “The Meaning of Life.” Reprinted in The Meaning of Life. E. D. Klemke (ed.). New York: Oxford University Press, 1981: 81-117. (Originally published in 1957.)
Camus, A. “The Myth of Sisyphus.” J. O’Brien (tr.). Reprinted in part in Ways of Wisdom: Readings on the Good Life, Steve Smith (ed.). Lanham, MD: University Press of America, 1983: 244-255. (Originally published in French in 1943.)
Carlyle, T. 1834. Fraser’s Magazine. available online at Project Gutenberg.
Heidegger, M. Being and Time. J. Macquarrie and J. Robinson (trs.). Oxford: Blackwell, 1973. (Originally published in German in 1927.)
James, W. “Is Life Worth Living?” In The Will to Believe and Other Essays in Popular Philosophy. New York: Dover Publications, 1956: 32-62. (Originally published in 1895.)
James, W. “What Makes a Life Significant?” In On Some of Life’s Ideals. New York: Henry Holt and Company, 1899: 49–94. Reprinted in William James: Writings 1878-1899. New York: The Library of America, 1992: 861-80.
Kierkegaard, S. Concluding Unscientific Postscript. (Available free online and in several print editions.) (Originally published in Danish in 1846.)
Kierkegaard, S. Either/Or: A Fragment of Life. (Available free online and in several print editions.) (Originally published in Danish in 1843.)
Klemke, E. D. (ed.). The Meaning of Life. New York: Oxford University Press, 1981.
Klemke, E. D. (ed.). The Meaning of Life. 2nd Ed. New York: Oxford University Press, 2000.
Klemke, E. D. & Cahn, S. (eds.). The Meaning of Life: A Reader, 3rd Ed. New York: Oxford University Press, 2008.
Metz, T. “The Meaning of Life.” The Stanford Encyclopedia of Philosophy (Summer 2013 Edition). Edward N. Zalta (ed.).
Nagel, T. “The Absurd,” Reprinted in The Meaning of Life. E. D. Klemke (ed.). New York: Oxford University Press, 1981: 151-161. (Originally published in 1971.)
Nietzsche, F. Ecce Homo. (available free online and in several print editions.) (Originally written in German in 1888-1889.)
Nietzsche, F. On the Genealogy of Morals. Ian Johnston (tr.). 2009.
Nietzsche, F. Thus Spake Zarathustra. (available free online and in several print editions.) (Originally written in German in 1883-1885.)
Nietzsche, F. Twilight of the Idols. (available free online and in several print editions.) (Originally written in German in 1888; published in 1889.)
Nietzsche, F. The Will to Power. (available free online and in several print editions.) (Originally published in German in 1901-1911.)
The Oxford English Dictionary. Oxford: Oxford University Press: 2014.
Russell, B. “A Free Man’s Worship.” Reprinted in The Meaning of Life. E. D. Klemke (ed.). New York: Oxford University Press, 1981: 55-62. (Originally published in 1903.)
Russell, B. The Conquest of Happiness. New York: Liveright, 1930.
Sartre, J. P. Being and Nothingness. H. E. Barnes (tr.). New York: Philosophical Library, 1956. (Originally published in French in 1943.)
Sartre, J. P. “Existentialism and Humanism.” B. Frechtman (tr.). 1956. Reprinted in Ways of Wisdom. S. Smith (ed.). Lanham, MD: University Press of America, 1983: 234-43.
Schlick, M. 1927. “On the Meaning of Life.” Reprinted in The Meaning of Life: A Reader, 3rd Ed., E. D. Klemke & S. Cahn (eds.). P. Heath (tr.). New York: Oxford University Press, 2008: 62-71. (Originally published in 1927.)
Schopenhauer, A. 1840. On the Basis of Morality. (available free online and in several editions)
Schopenhauer, A. “On the Suffering of the World.” in Essays and Aphorisms. R. J. Hollingdale (tr.). New York: Penguin Books, 1970: 41-50. (Originally published in German in 1851.)
Schopenhauer, A. “On the Vanity of Existence.” in Essays and Aphorisms. R. J. Hollingdale (tr.). New York: Penguin Books, 1970: 51-54. (Originally published in German in 1851.)
Schopenhauer, A. “On Affirmation and Denial of the Will to Live.” in Essays and Aphorisms. R. J. Hollingdale (tr.). New York: Penguin Books, 1970: 61-65. (Originally published in German in 1851.)
Schopenhauer, A. “On Suicide.” in Essays and Aphorisms. R. J. Hollingdale (tr.). New York: Penguin Books, 1970: 77-79. (Originally published in German in 1851.)
Schopenhauer, A. The Essays of Arthur Schopenhauer: The Wisdom of Life. T. B. Saunders (tr.). 1860. rpr. in The Project Gutenberg EBook of The Essays of Arthur Schopenhauer, 2004.
Schopenhauer, A. The Essays of Arthur Schopenhauer: On Human Nature. T. B. Saunders (tr.). 1860. Reprinted in The Project Gutenberg EBook of The Essays of Arthur Schopenhauer, 2004.
Schopenhauer, A. The World as Will and Representation. 2 Vols. E. F. J. Payne (tr.). 1969. New York: Dover Publications. (Vol. 1 first appeared in 1818, Vol. 2 in 1844, in German.)
Schopenhauer, A. Essays and Aphorisms, R. J. Hollingdale (tr.). 1970. New York: Penguin Books. (Originally published in German in 1851.)
Seachris, J., 2012, “Meaning of Life: The Analytic Perspective,” The Internet Encyclopedia of Philosophy.
Smith, S., (ed.), 1983, Ways of Wisdom: Readings on the Good Life, Lanham, MD: University Press of America.
Tagore, R., 1961, The Religion of Man, London: George Allen & Unwin Co., Reprinted Boston: Beacon Press. (Originally published in 1931.)
Taylor, R., 1970, “The Meaning of Life,” Reprinted in The Meaning of Life, E. D. Klemke (ed.), New York: Oxford University Press, 1981: 141-150.
Tolstoy, L., 2005, A Confession, Aylmer Maude (tr.), Reprinted Mineola, NY: Dover Publications. (Originally published in 1884.)
Young, J. 2014, The Death of God and the Meaning of Life, 2nd ed., New York & London: Routledge.

Author Information

Wendell O’Brien
Email: w.obrien@moreheadstate.edu
Morehead State University
U. S. A.

Philosophy of Mental Illness

The Philosophy of Mental Illness is an interdisciplinary field of study that combines views and methods from the philosophy of mind, psychology, neuroscience, and moral philosophy in order to analyze the nature of mental illness. Philosophers of mental illness are concerned with examining the ontological, epistemological, and normative issues arising from varying conceptions of mental illness.

Central questions within the philosophy of mental illness include: whether the concept of a mental illness can be given a scientifically adequate, value-free specification; whether mental illnesses should be understood as a form of distinctly mental dysfunction; and whether mental illnesses are best identified as discrete mental entities with clear inclusion/exclusion criteria or as points along a continuum between the normal and the ill. Philosophers critical of the concept of mental illness argue that it is not possible to give a value-neutral specification of mental illnesses. They argue that our concept of mental illness is often used to disguise the ways in which mental illness categories enforce pre-existing norms and power relations. Questions remain about the relationship between the role that values play within the concept of mental illness and how those values relate to concepts of illness more generally. Philosophers who consider themselves a part of the neurodiversity movement claim that our concept of mental illness should be revised to reflect the diverse forms of cognition that humans are capable of without stigmatizing individuals who are statistically non-normal.

There are also epistemological issues concerning the relationship between mental illness and diagnosis. Historically, the central issue has been how nosologies (or classification-schemas) of mental illness, especially the Diagnostic and Statistical Manual of Mental Disorders (the DSM), relate mental dysfunctions to observable symptoms. Mental dysfunction, on the DSM system, is identified via the presence or absence of a set of symptoms from a checklist. Those critical of the use of behavioral symptoms to diagnose mental disorders argue that symptoms are useless without a theoretically adequate conception of what it means for a mental mechanism to function poorly. A minimal constraint on a diagnostic system is that it must be able to distinguish a person with a genuine mental illness from a person suffering from a problem with living. Critics argue that the DSM, as currently constituted, cannot do this.

Lastly, there are a host of questions surrounding the relationship between mental illness and normativity. If mental illness undermines rational agency, then there are questions about the degree to which the mentally ill are capable of autonomous decision-making. This bears on questions regarding the degree of moral and legal responsibility that the mentally ill can be assigned. Further questions about agency arise over bioethical questions about the standing of the demands made on healthcare professionals by the mentally ill. For example, individuals with Body Integrity Identity Disorder (BIID) request that surgeons amputate their healthy limbs in order to restore a balance between their internal self-representation and their external body image. Bioethicists are divided over whether the requests of patients with BIID are genuinely autonomous and deserving of assent.

Table of Contents

1. Conceptions of Mental Illness
2. Criticisms of the Bio-psycho-social Model
3. Neurodiversity
a. Motivation
b. Autism, Psychopathy
4. Responsibility and Autonomy
a. Psychopathy
b. Body Integrity Identity Disorder and Gender Dysphoria
5. References and Further Reading

1. Conceptions of Mental Illness

a. Alienism and Freud

Although there are many conceptions of madness found throughout the ancient world (demon possession, divine revelation or punishment, and so forth), the conception of a distinctly mental form of illness did not fully begin to crystallize, at least in the West, until the latter half of the nineteenth century with the creation and rise of mental asylums. Individuals who were housed in asylums were thought to be psychotic or insane. Psychotic inmates were seen as distinctly different from the non-psychotic population and this justified the creation of special purpose institutions for the containment of psychotic individuals. Psychotics were construed as suffering from distinct and localizable organic brain disorders and were treated by medical professionals known as Alienists (Elliott 2004, 471). Writing at the time, German psychiatrist Emil Kraepelin divided psychoses in his nosology into two types: mood disorders and dementia praecox (Kraepelin 1896a, 1896b). All other forms of distress were thought to fall outside of the province of the asylum and of medical treatment.

Non-psychotic individuals who were unhappy with their lives, who felt intense anxiety, or who might vacillate between periods of high and low motivation were not thought to have a psychotic problem. These individuals were not treated or seen by alienists but instead sought help from their family, friends, or clergy (Horwitz 2001, 40). Non-psychotic dysphoria (unhappiness) was, in this context, understood not as a distinctly medical problem but instead in a variety of other forms: a typically social problem with living, a character flaw, or simply as a different way of life. The solution for the unhappiness that many individuals suffered was not found within the asylum but instead from the family, god, or other social institutions. There was, at this time, a clear distinction between medical problems resulting in psychosis and social problems that caused suffering.

Sigmund Freud grew up in the alienist tradition and received his medical degree in 1881. Freud’s theory of the mental and of mental illness would revolutionize western understanding of psychology and would become the dominant paradigm in the psychological sciences until the middle of the twentieth-century. Where the alienists saw mental illnesses as manifestations of rather discrete brain dysfunctions, Freud would come to understand the distinction between normal persons and the mentally ill as arising from a conflict in psychological mechanisms that were a part of the normal human repertoire (Freud 1915/1977; Ghaemi 2003, 4). Where the alienist understood non-psychotic unhappiness as a problem to be solved by individuals and their support networks, Freud understood problems in living as the domain of the psychotherapist. Paul Roazen famously quotes Freud as claiming that “[t]he optimum conditions for (psychoanalysis) exist where it is not needed—that is, among the healthy” (Roazen 1992, 160).

Crucial to Freud’s reorientation of mental disorder was his view of the relationship between observable behavioral symptoms and underlying psychological disorder. Unlike Kraepelin, who understood psychotic behavioral symptoms as closely tied to specific underlying brain dysfunction, Freud did not believe that behavioral symptoms could be tied to unique disorders. The underlying source of human psychological suffering, as Freud understood it, stemmed from universal childhood experiences that, if poorly resolved or understood, could manifest in adulthood as neurosis. Freud saw repression, for example, as a normal part of development from child to adult. An individual could fail to properly apply repressive techniques. If this occurs, then poorly repressed trauma can manifest itself in myriad ways, including obsessive cleaning, chronic gambling, melancholia, and so forth (Freud 1915/1989; Horwitz 2001, 43). Simply noting melancholia in a patient would not be enough for a psychoanalyst to understand the source of repressive dysfunction.

Because a client troubled by chronic gambling and another client troubled by hysteria could, in principle, be suffering from the same underlying repressive dysfunction, any diagnostic manual based on Freud’s conception of mental disorders would not hold symptoms as fundamentally important to the diagnostic process. Instead, Freud claimed that the only way to truly understand a patient’s underlying psychological dysfunction is to acquire detailed information about a person, including his or her dreams, in order to uncover repressed sexual urges (Freud 1905/1997).

The first two editions of the DSM were largely based on Freud’s underlying theory of repression and mental disorder. This nosology would dominate western thinking about the mentally ill until the 1960s.

b. DSM I – II

When the first edition of the Diagnostic and Statistical Manual of Mental Disorders was published in 1952, psychodynamic theorists dominated the clinical and academic landscape. Nearly two-thirds of the psychology departments in American universities were chaired by psychoanalysts and the emerging DSM strongly reflects their theoretical assumptions (Strand 2011, 277). By this point, psychiatry was seen as an extension of medical practice. This required the creation of a nosology, a catalogue of disorders for clinical practice (Graham 2010, 5).

The first edition of the DSM represented a revolutionary change in the conception and treatment of mental illness. Given the expansive notion of mental illness proposed by Freud and his students, the first two editions of the DSM conclude that many individuals who, prior to this point, were not seen as mentally ill would benefit from therapy. Because symptoms were only weakly correlated with underlying illness on the psychodynamic view, only repeated and intensive conversations with a qualified analyst could help a person get to the root cause of his problems (Horwitz 2002, 45; Grob 1991, 425). The first edition of the DSM devotes a significant proportion of its 145 pages to a classification of mental illness concepts and terms (American Psychiatric Association 1952, 73-119). Unlike future editions of the manual, illnesses are not identified in terms of a series of symptoms but instead in terms of the underlying psychological conflict responsible. For example, the manual defines Psychoneurotic Disorder as:

[T]hose disturbances in which “anxiety” is a chief characteristic, directly felt or expressed, or automatically controlled by such defenses as depression, conversion, dissociation, displacement, phobia formation, or repetitive thoughts and acts…a psychoneurotic reaction may be defined as one in which the personality, in its struggle for adjustment to internal and external stresses, utilizes the mechanisms listed above to handle the anxiety created (American Psychiatric Association 1952, 12-13).

Yet the presence of anxiety is not sufficient to diagnose psychoneurotic disorder. Anxiety must result from an underlying conflict between the personality and other stressors. It is the role of the analyst, in this context, to discover whether this underlying conflict is present. This cannot be done by merely observing symptoms; only psychodynamic therapy can discover the true cause of a patient’s anxiety (Grob 1991, 423).

Dissent against this system of classification and diagnosis arose from many groups both external to psychiatry and internal to the psychiatric discipline; these criticisms solidified in the 1960s. The emerging “anti-psychiatry” movement would come to challenge the assumptions that had grounded psychiatric practice in the first half of the 20th century. Conceptions of mental illness, the underlying assumptions behind the process of diagnosis, and even the status of psychiatry as a science were all subject to sustained critiques. Several of the most vocal critics of psychiatry were themselves clinical psychiatrists: R.D. Laing, David Rosenhan, and Thomas Szasz. The latter’s critique of psychiatric practice and of conceptions of mental illness is outlined in detail below in section 2(b).

Rosenhan conducted a pair of famous studies that would radically undermine the scientific legitimacy of clinical diagnosis, especially in the eyes of the public. In his initial study, Rosenhan, along with seven other volunteers, attempted to have themselves admitted to several mental health institutions (Rosenhan 1973, 179-180). Rosenhan instructed his collaborators to claim that they heard a voice which said only two words: “thud” and “hollow.” For all other questions, Rosenhan instructed his subjects to answer honestly. The words ‘thud’ and ‘hollow’ were chosen specifically because they did not correspond to a known pattern of neurosis in the DSM II. Rosenhan and all of his confederates were admitted to mental institutions; all but one of Rosenhan’s subjects were admitted under a diagnosis of schizophrenia (Rosenhan 1973, 180). Once admitted, subjects were held for as long as 52 days before they were released, despite the fact that they did not play-act any symptoms of any mental illness. Rosenhan noted that once he and his confederates had been admitted, everyday behavior began to be interpreted as a sign of their underlying mental illness. Subjects who were taking notes for later use, for example, were noted as engaging in unusual “writing behavior;” subjects speaking with a psychiatrist about their childhood and family were construed as having telltale neurotic early-childhood issues (Rosenhan 1973, 183). Since these subjects were not otherwise in distress, Rosenhan claimed that the diagnostic process was not representing an underlying ‘mental illness’ in any of the pseudopatients but instead that the diagnostic process was unscientific and unfalsifiable.

Once Rosenhan publicized the results of his initial study, several institutions challenged his results by re-asserting the validity of the diagnostic process. They claimed that their institutions would not have fallen for Rosenhan’s ruse and challenged him to send pseudopatients to them for analysis. Rosenhan agreed and, despite the fact that no pseudopatients were actually sent, these institutions suspected at least 41 of their new patients (more than 20% of new patients over a three-month period) of being pseudopatients sent by Rosenhan (Rosenhan 1973, 181). Again it seemed as if the diagnostic process was incapable of accurately separating the mentally ill from the healthy. In part as a result of critiques of the diagnostic process like Rosenhan’s studies, the diagnostic model of psychiatry would be radically altered. Beginning as early as 1974, the American Psychiatric Association would assign a taskforce to prepare for the publication of the next edition of the DSM. The DSM III that would result from this process, published in 1980, would represent a rejection of the psychodynamic assumptions built into the previous versions of the manual and provide a framework for all future editions of the DSM.

c. The Bio-psycho-social Model: DSM III – 5

The most recent edition of the Diagnostic and Statistical Manual of Mental Disorders, the DSM 5, was published in 2013. This edition does not substantially modify the conception of mental disorder that has been offered by the manual since its third edition, first published in 1980. In comparison with the first edition of the DSM, the DSM 5 includes diagnostic criteria for more than 400 individual disorders. The conception of mental disorders used in the DSM 5 presents them as biological, psychological, or social dysfunctions in an individual; this model has, unsurprisingly, come to be called the Bio-psycho-social model. It represents the current consensus view of mental disorder among psychological researchers and clinical practitioners. Psychologists disagree about whether to understand this definition conjunctively or disjunctively (Ghaemi 2007, 8). The Bio-psycho-social model states:

A mental disorder is a syndrome characterized by clinically significant disturbance in an individual’s cognition, emotion regulation, or behavior that reflects a dysfunction in the psychological, biological, or developmental processes underlying mental functioning. Mental disorders are usually associated with significant distress or disability in social, occupational, or other important activities. An expectable or culturally approved response to a common stressor or loss, such as the death of a loved one, is not a mental disorder. Socially deviant behavior (e.g., political, religious, or sexual) and conflicts that are primarily between the individual and society are not mental disorders unless the deviance or conflict results from a dysfunction in the individual, as described above (American Psychiatric Association 2013, 20).

From this characterization we can extract five criteria that serve to distinguish a genuine mental disorder from other sorts of issues (problems in living, character flaws, and so forth). In order for a disturbance to be classified as a mental disorder it must:

Be a clinically significant disturbance in cognition, emotion regulation, or behavior
Reflect a dysfunction in biological, psychological, or developmental processes
Usually cause distress or disability
Not reflect a culturally approved response to a situation or event
Not result purely from a problem between an individual and her society

All of the criteria, with the exception of the ‘distress’ criterion, are individually necessary and jointly sufficient for the classification of a patient’s symptoms as stemming from a mental disorder. Prior to the seventh printing of the DSM II, homosexuality had been included as a mental disorder. The revisions to the text that took place between the DSM II and the DSM III were meant to make clear that homosexuality (“an interest in sexual relations or contact with members of the same sex”) does not satisfy the criteria for a mental disorder so long as it is not accompanied by clinically significant dysphoria (American Psychiatric Association 1973, 2). However, an individual who feels dysphoria as a result of their homosexuality can be diagnosed with an Unspecified Sexual Dysfunction in the DSM 5 (American Psychiatric Association 2013, 450).

The third, ‘distress,’ criterion is neither necessary nor sufficient to qualify a mental disturbance as a disorder. This can be seen by examining the process for the diagnosis of the ‘cluster B’ personality disorders (histrionic, anti-social, borderline, and narcissistic personality disorders). Subjects with cluster B disorders often do not suffer as a result of their condition. Indeed, those with Antisocial Personality Disorder, for example, may not see themselves as disordered and may even approve of their condition. This has led some individuals with personality disorders to align with the emerging Neurodiversity movement (see section 3 below). The patterns of behavior manifested by those with cluster B personality disorders are, nonetheless, understood as reflecting clinically significant disturbances in cognition, emotion regulation, and behavior. They form a distinct class of mental disorders in the DSM (American Psychiatric Association 2013, 645-684). Some philosophers have argued that the cluster B personality disorders should not be understood as mental disorders but instead that they are better understood as distinctly moral disorders. Louis Charland argues for this conclusion. He claims that, unlike the cluster A and C personality disorders, the only treatment for the cluster B disorders is distinctly moral improvement. Because this fact about their treatment uniquely distinguishes the cluster B personality disorders from all other mental disorders in the DSM, Charland concludes that they reflect moral (as opposed to value-neutral) dysfunction (Charland 2004a, 67).

Since the publication of the DSM III, mental disorders have been defined as being caused by a clinically significant dysfunction of a mental mechanism. Because the definition of mental illness invokes the concept of dysfunction, it is often subject to critique (see the following section). Although the general definition of mental disorder used by the DSM invokes the concept of dysfunction, the diagnostic criteria for particular mental illnesses do not. It is instructive to provide an example of how particular disorders are defined within the manual. Anorexia Nervosa, for example, is defined by the presence of three clusters of behavioral symptoms (American Psychiatric Association 2013, 338-339):

A: Restriction of energy intake relative to requirements, leading to a significantly low body weight in the context of age, sex, developmental trajectory, and physical health.

B: Intense fear of gaining weight or of becoming fat, or persistent behavior that interferes with weight gain, even though at a significantly low weight.

C: Disturbance in the way in which one’s body weight or shape is experienced, undue influence of body weight or shape on self-evaluation, or persistent lack of recognition of the seriousness of the current low body weight.

Importantly, this characterization of Anorexia Nervosa presents the disorder as a distinct, specifiable condition that is present in the person, and it presents the underlying dysfunction as uniquely picked out by the presence of the behavioral symptoms identified as A and C; “B” symptoms are seen as common but not essential to diagnosis (American Psychiatric Association 2013, 340). Given the underlying conception of mental disorder offered by the authors of the DSM, Anorexia Nervosa cannot simply be the result of a conflict between the individual and society. It also must not result from an individual accurately trying to adopt social norms about beauty or appearance or diet. It must instead result from a combination of biological, psychological, and/or social dysfunctions. However, the diagnostic criteria do not indicate what this underlying dysfunction consists in, nor do they offer any evidence that the symptoms associated with the disorder are caused by the same underlying dysfunction.

In part for reasons of this sort, both the general bio-psycho-social model of mental disorder and the uses of the model to characterize particular disorders, like Anorexia Nervosa, have been subject to repeated criticism by philosophers.

2. Criticisms of the Bio-psycho-social Model

The definition of mental disorder that stems from the bio-psycho-social model has been subject to several criticisms. Philosophical critiques of the definition of disorder have ranged from calling for revision and specification of the concept of disorder to abandonment of the concept altogether. Many of the 400+ disorders that appear in the DSM have also been criticized. In some cases, these critiques are internal: the disorders do not appear to match the criteria of mental disorder offered in the DSM itself; in other cases, as with some critics of schizophrenia, the aim is to undermine both the existence of the disorder and the conception of mental disorder that results in its inclusion (Bentall 1990).

Many members of the antipsychiatry movement described in section 1b were responsible for setting the stage for the criticisms of the bio-psycho-social model. Although in part political, this movement saw the rise of several alternative conceptualizations of human function and dysfunction that have come to challenge the DSM’s conception of a mental disorder. Chief among these were Thomas Szasz’s influential arguments that mental illness is a ‘myth’ and the rise of ‘positive psychology’ as a viable alternative psychological ideology.

a. Mental Illness as Dysfunction

Nassir Ghaemi has criticized the current conception of mental disorder as resting on an unscientific political compromise among factions of clinical and research psychologists, one struck in part to stave off the looming threat of neurobiological eliminativism (see section 2b). Ghaemi argues that many psychologists view the Bio-psycho-social conception of mental illness disjunctively and focus predominantly on their preferred method for understanding a disorder depending on their own assumptions of dysfunction (Ghaemi 2003, 10). Although this compromise presents the appearance of consensus, Ghaemi argues that it is an illusion. He advocates for a form of integrationism about mental disorder that has become popular in some circles (Ghaemi 2003, 291; Kandel 1998, 458). A true integration of biology and psychology requires solving the currently unresolved issue over consciousness and how consciousness is realized by the brain. Because this question does not appear to be resolvable in the near term, integrationists of Ghaemi’s stripe have offered a placeholder for a replacement to the Bio-psycho-social model instead of a true alternative to current models.

Philosophers have also criticized the DSM conception of mental disorder for its lack of a unified theory of dysfunction. The current DSM requires that mental disorders reflect a dysfunction of biological, psychological, or social mechanisms, though the text itself is silent on what it would mean for a mechanism to be dysfunctional and does not provide any evidence that the symptoms used for clinical diagnosis of a disorder are caused by a single underlying dysfunction.

Philosophers have appealed to at least three distinct senses of dysfunction to craft a unified theory of mental disorder: etiological, propensity, and normative dysfunction. Etiological function (and dysfunction) is construed in evolutionary terms. A mechanism is functioning, in the etiological sense, if it evolved to serve a specific purpose and if it is, currently, serving its evolved purpose. In order to discover the function of a mental mechanism, it is necessary to discover its evolved function. Dysfunction can then be construed relative to this purpose (Wakefield 1999, 374; Boorse 1997, 12). A mechanism is dysfunctional if it is not fulfilling its evolutionary purpose. Depression, for example, may, in some cases, represent a dysfunction of a mechanism evolved for affective regulation. However, evolutionary psychological theories of mental function are still in their early stages. Furthermore, some philosophers want to allow for the possibility that many of our mental mechanisms may not have evolved to serve the functions to which we currently put them.

A propensity function is not constrained by past selective pressures but instead defines function and dysfunction based upon current and future selective success. Male aggression, for example, may have been adaptive in our ancestral environment and hence may represent a case of proper functioning on the etiological theory. On the propensity view, however, male aggression may not be adaptive for life in modern societies even if it was fitness-enhancing in our ancestral environments. Male aggression might therefore, on a propensity account of function and dysfunction, represent a dysfunctional mechanism and hence a mental disorder (Woolfolk 1999, 663). As with the evolutionary view, propensity function conceptions of mental dysfunction have the advantage of appealing to descriptive evidence in order to determine whether or not a specific pattern of behavior is fitness-enhancing in its current context (Boorse 1975, 52). However, crafting a theory of function and dysfunction in terms of present-day fitness appears to allow some conditions to count as mental disorders that we may be averse to labeling mental illnesses. One major issue with appealing to propensity function is that it appears to resurrect defunct mental illnesses. Drapetomania, the mental illness that was applied to runaway slaves in the nineteenth century, would appear to satisfy the definition of a propensity dysfunction. Dysphoria caused by the conditions of slavery and a strong desire to abandon one’s current condition are arguably not fitness-enhancing, in a strictly evolutionary sense, and therefore appear to satisfy the criteria for a propensity dysfunction (Woolfolk 1999, 664).

Purely normative accounts of dysfunction have not garnered much favor within the psychological or philosophical disciplines. On a purely normative account of dysfunction, a person is said to be mentally ill based upon whether or not the behavior fits within the context of a larger normative network. Whether we choose to call a person mentally ill or merely ‘bad’ may depend on whether or not we believe agents like this should be held morally responsible and the concept of responsibility may not be reducible to non-normative elements (Edwards 2009, 78). On such conceptions, it is impossible to avoid invoking evaluative concepts when describing what a mental illness is or why a particular set of behaviors is best understood as an illness (Fulford 2001, 81).

George Graham argues for what he calls an unmediated defense of realism about mental illness; Graham’s defense is unmediated in the sense that he does not believe that it must be shown that mental illnesses are natural kinds or result from brain-disorders in order to qualify as legitimate classification-independent kinds (Graham 2014, 133-134). Instead, he argues that “the very idea of a mental disorder or illness is the notion of a type of impairment or incapacity in the rational or reasons-responsive operation of one or more basic psychological faculties or capacities in persons” (Graham 2014, 135-136; see also Graham 2013a and 2013b). These capacities could be described or analyzed at various levels of implementation, according to Graham, though their malfunction is understood in normative terms.

Perhaps the most influential theory of dysfunction within the philosophical literature is offered by Jerome Wakefield. Wakefield’s conception of mental disorder attempts to bridge the gap between purely objective conceptions of disorder and subjective or normative views. On Wakefield’s view, a mental disorder arises only when a ‘harmful dysfunction’ is present. This combines two different types of concepts: a concept of dysfunction and a concept of harm. Wakefield’s conception of dysfunction is etiological. A mechanism is dysfunctional if it fails to perform the purpose that it evolved to perform. Etiological function is objective in the sense that etiological functions are pan-cultural: they are not dependent on cultural conceptions of function or value. They are, instead, a set of universally shared facts about human nature. The ‘harmfulness’ criterion, on the other hand, is sensitive to cultural context (Wakefield 1992, 381; Wakefield 1999, 380). As Wakefield understands it, a person is harmed by a disorder if the disorder causes a “deprivation of benefit to a person as judged by the standards of the person’s culture” (Wakefield 1992, 384). In order to be diagnosed with a mental illness, it must be true that an agent’s behavior is caused by a malfunction of an evolved mental mechanism and, furthermore, it must also be true that this dysfunction, in the context of that individual’s culture, deprives her of a benefit.

Wakefield, and others like him, argue that it is crucial to distinguish between mental disorders and other sources of distress (Horwitz 1999). The crucial factor in determining proper treatment for a person’s dysphoria, these philosophers argue, is a proper identification of the cause of his or her distress. Mental disorders are caused by harmful mental dysfunctions. Other sources of distress are better understood as problems in living. Many types of unhappiness that are typically diagnosed as depression, on this view, are better understood not as stemming from a mental disorder but instead through an examination of the larger social factors that may be causing unhappiness. Because the DSM’s conception of mental disorder is cause-insensitive and identifies depression only via symptoms, it fails to distinguish between these two forms of unhappiness. The danger, these philosophers argue, is that mental disorders are construed as being problems that reside within an agent and that treatments are therefore focused only on symptom relief, usually pharmaceutically aided. If distress has an underlying social cause, if it is a problem in living, then treatment of unhappiness should have a radically different focus. For example, the symptoms described by Betty Friedan as caused by “the problem that has no name” fit relatively easily within the rubric of depression (Friedan 1963, 17). However, Wakefieldian views would resist this diagnosis. The underlying cause of the distress Friedan describes is social and the best treatment of this form of distress is social change. Sadness that is caused by patriarchal or misogynist cultures does not represent a malfunction in the evolved mechanisms in a person (it may represent just the opposite). On the DSM model, treatment may merely mask these depressive symptoms pharmacologically and would only serve to maintain the unjust social situations that give rise to them. The best understanding of “the problem that has no name” is to identify it as a problem in living stemming from misogynist assumptions about the roles available to women in a culture. Wakefield’s view is realist in the sense that its conception of mental dysfunction is independent of our acts of classification (Graham 2014, 125). Because function is grounded on etiology, there is a culturally-independent fact-of-the-matter regarding the presence or absence of a dysfunction in a person.

Wakefield’s harmfulness criterion allows for different cultures to come to different conclusions about which evolutionary dysfunctions will rightfully count as a mental disorder. On Wakefield’s view, homosexuality may represent a genuine evolutionary dysfunction (in the sense that exclusive homosexual behavior threatens the propagation of genes into future generations) but homosexuality is not harmful in a contemporary broadly Western cultural context. Because it is not harmful in this cultural context, it is a mistake to think of homosexuality as a disorder. This leaves open the possibility that the harmfulness criterion would allow homosexuality to be a legitimate mental disorder in other cultural contexts.

Other critics have assailed Wakefield’s appeal to etiological dysfunction. Aside from the general epistemological problem that results from identifying the evolutionary function of psychological mechanisms, there are two problems that arise with an appeal to etiological dysfunction. First, some have argued that depression is an evolved response and hence could not be construed as a mental disorder on Wakefield’s view (Bentall 1992, 96; Woolfolk 1999, 660). Second, some have argued that many of our mental mechanisms may not have arisen as a result of evolutionary selection pressures. They may be evolutionary “spandrels” in Stephen Gould’s sense. The white color of bones, for example, necessarily results from the composition of bone but is itself not a property explicitly selected in an evolutionary sense. A spandrel cannot dysfunction in Wakefield’s terms because it lacks an evolutionary cause for its existence. Although spandrels can confer adaptive advantages, they are importantly not themselves traits that are selected for. If any of our mental mechanisms are spandrels, then Wakefield’s view cannot explain disorders arising from their use (Gould and Lewontin 1979, 581; Woolfolk 1999, 664; Zachar 2014, 120). Famously, some philosophers have argued that complex human abilities, like our capacity for language, may themselves be evolutionary spandrels (Chomsky 1988; Lilienfeld and Marino 1995, 413). Furthermore, recent critics have suggested that too much of the recent work on mental illness has focused exclusively on elucidating the concept of illness or dysfunction and has neglected to consider how advances within the philosophy of mind and the cognitive sciences might change our conception of the ‘mental’ component of mental illness (Brülde and Radovic 2006, 99).

Philosophers who are critical of attempts to define a distinctly mental conception of disorder have been motivated, in part due to the arguments above, to move in two different directions. Some have proposed that we replace the concept of mental disorder with a strictly neurological conception of dysfunction. Doing so, they argue, would place disorders on a clearer and more scientific footing.

b. Neurobiological Eliminativism

The transition from the DSM II to DSM III brought with it the adoption of the biomedical model for diagnosis. Unlike the psychodynamic model, which saw symptoms as providing little insight into the underlying cause of distress, the biomedical model afforded symptoms pride of place in diagnosis. For much of the 20th century, the biomedical model of diagnosis understood the symptoms that a patient brought to her clinician as providing insight into the underlying disorder(s) that caused the patient to consult the clinician in the first place.

Psychology, as a therapeutic discipline, adopted this model of diagnosis and, in the process, began to categorize patient symptoms into discrete groupings, each caused by a specific mental disorder. However, some philosophers have noted that the biomedical model itself has changed rapidly in the 21st century and that this has created a dilemma for clinical psychological models of diagnosis. Patient reports, in current biomedical models of diagnosis, have lost their pride of place as the key markers for diagnosis. In their place, clinicians turn to laboratory test results to determine the true illness responsible for a patient’s suffering. One motivation for this change, within general clinical practice, is that symptoms underdetermine diagnosis. Adopting this new biomedical model for mental illnesses, however, has been seen by some as presenting an eliminativist threat to mental disorders (Broome and Bortolotti 2009, 27).

Eliminative materialism arose in the 20^th century in order to challenge views about the mind that assign mental states explanatory/causal roles. The views targeted by the eliminativist were grounded in common-sense or “folk” ideas about everyday mental states like beliefs and desires. These views situated mental states as entities belonging to proper scientific explanation. Eliminativists argued that folk psychological theories of the mind would fare no better than our folk biological or physical theories and that the folk mental states should be eliminated from scientific explanations (Churchland 1981). Mature cognitive and neuro-sciences do not need to make reference to folk psychological states like beliefs and desires in order to explain human behavior; furthermore, the neural architecture of the brain itself does not appear to house discrete localizable states like beliefs and desires that are assumed by folk psychology (Ramsey, Stich and Garon 1990). Folk psychological theories tell us that the best explanation of human behavior (including mental illness) should be given in terms of dysfunctional mental states (delusions, compulsive desires, etc.). The eliminativist, on the other hand, undermines this view by claiming that nothing in the brain corresponds to these folk-psychological states and that we are better off without appealing to them.

Eliminative materialism has arisen as a challenge to the DSM construal of mental disorders in the form of cognitive neuropsychology. “This process may start as a process of reduction (from the disorder behaviorally defined to its neurobiological bases), but in the end psychiatry as we know it will not just be given solid scientific foundations by being reduced to neurobiology; it will disappear altogether” (Broome and Bortolotti 2009, 27). Just as biomedical diagnosis has shifted away from patient report toward more direct assessments using bio-physiological metrics, the eliminativist argues that the same process should occur with mental disorders. Neurological dysfunction should supplant folk psychological discussions of mental dysfunction. In much the same way as Alzheimer’s disease is understood as a neurological brain disorder, the eliminativist claims that a mature cognitive neuroscience will replace contemporary classifications of mental disorders with neurological dysfunction (Roberson and Mucke 2006, 781).

Philosophers who resist the eliminativist reduction of the mental to the neurological argue that at least some types of mental disorders cannot be understood without appealing to mental states. Plausible candidates for this type of disorder include delusions (Broome and Bortolotti 2009, 30), personality disorders (Charland 2004a, 70) and various sexual disorders (Soble 2004, 56; Goldman 2002, 40). Personality disorders, especially those falling under the category of ‘Cluster-B’ disorders, appear to require that individuals have acquired bad characters in order to accurately explain why the behavior stemming from the illness is disordered. If normative competence necessarily makes reference to belief-forming mechanisms (having knowledge about moral concepts, recognition of the agency of other persons, etc.) then Cluster-B personality disorders cannot be fully reduced to their neurobiological underpinnings without a meaningful loss of the disordered element of the disorder (Pickard 2011, 182).

On a related note, philosophers have attempted to resist the purely mechanistic neuro-scientific explanations of psychology. Jeffrey Poland and Barbara Von Eckardt argue that the DSM’s bio-psycho-social model relies on a mechanistic model of mental illness but that purely mechanistic models fail to explain the representational aspects of a mental illness; in their words “[a]ny such account will extend well beyond what one would naturally assume to be the mechanism of (or the breakdown of the mechanism of) the cognition or behavior in question” (Von Eckardt and Poland 2004, 982). Peter Zachar argues for a view he calls the Imperfect Community Model. This model is based on a rejection of essentialism grounded in pragmatism. Zachar argues that mental illnesses are united as a class despite lacking any necessary and sufficient conditions to define them. Mental disorders nonetheless bear a prototypical or family resemblance to one another, and this resemblance suggests a rough unity to the concept (Zachar 2014, 121-8).

c. The Role of Value

There are related questions that arise about the nature and role of value and mental illness. The first has to do with whether mental illness is a value-neutral concept. Nosologies of mental illness attempt to create value-neutral definitions of the disorders they contain. In the ideal, the concepts picked out by manuals like the DSM are supposed to reflect an underlying universal human reality. The mental disorders contained therein are, with only minor exception, not meant to project culturally relative normative value judgments onto the domain of the mental.

The DSM includes a “cultural formulation” section meant to distinguish culturally specific, explicitly normative disorders from the supposed pan-cultural, value-neutral disorders that make up the bulk of the manual (American Psychiatric Association 2013, 749). In part this approach stems from the idea that psychologists adhering to the bio-psycho-social model of mental disorders view their project as being on par with nosologies of non-mental disorders. There are two questions worth raising here. The first is whether or not this “likeness argument” has any merit; the second is whether or not the biomedical illness concept is, itself, value-neutral (Pickering 2003, 244). A heart attack, for example, is a disorder, on this model, no matter the time or location of the infarction. Heart attacks are, in this sense, natural kinds and proper objects for scientific study. A heart attack represents a particular form of cardiovascular dysfunction that is agnostic about the cultural or moral values of a particular community. Despite the fact that heart attacks may not present the same symptoms across different sufferers (some may grab their left arms, some may scream, some may fall to the ground, etc.), what unites these heterogeneous-seeming symptoms is an underlying causal story that explains them (Boyd 1991, 127). Mental disorders are thought to operate on the same principle. However, the view that psychological symptoms are united by a common cause may result from pre-theoretical assumptions about mental states (Murphy 2014, 111-121). Critics of the bio-psycho-social model argue that values are an essential component of the concept of mental illness. If values are an ineliminable part of the concept of mental illness, we should be led to ask what kinds of values are invoked by the concept.

Michel Foucault was an early critic of mental illness and mental health institutions. In his Madness and Civilization: A History of Insanity in the Age of Reason, Foucault argued that asylums, being institutions where ‘the mad’ were separated from the rest of society, emerged historically by the application of models of rationality that privileged individuals already in power. This model served to exclude many members of society from the circle of rational agency. Asylums functioned as a place for society to house these undesirable persons and to reinforce pre-existing power relations; cures, when available, represented conformity to existing power structures (Foucault 1961/1988). Foucault’s critique of mental disorder inspired a generation of psychologists, many of whom see themselves as part of a new counter-movement from within the discipline: the Positive Psychology movement. The constructivist and value-laden interpretation of the DSM’s bio-psycho-social model of mental disorder has led some within this movement to call for the abandonment of the model. There is an intrinsic problem, they argue, with viewing individuals as, primarily, vehicles of dysfunction. Those within the positive psychology movement argue that a new, openly value-laden, conception of human beings should supplant the manual: “[t]he illness ideology’s conception of “mental disorder” and the various specific DSM categories of mental disorders are not reflections and mappings of psychological facts about people. Instead, they are social artifacts that serve the same sociocultural goals as our constructions of race, gender, social class, and sexual orientation—that of maintaining and expanding the power of certain individuals and institutions and maintaining social order as defined by those in power” (Maddux 2001, 15).

Hybrid views, like those of Jerome Wakefield, which attempt to delineate a value-neutral and a value-laden component to the concept of mental illness have also been subject to criticism for the role they assign value. Richard Bentall, for example, has argued that the supposedly objective components of these theories contain value-laden assumptions. Bentall argues that happiness satisfies the objective criteria for mental dysfunction (happiness is a rare mental state, it impairs judgment and decision making, and its neural correlates are at least partially well-understood); however, happiness is not viewed as a dysfunction (and consequently is not categorized as a mental illness) because we value the state for its own sake (Bentall 1999, 97). This view is echoed by constructivists about mental illness.

Constructivists about mental illness can hold a variety of positions about where the concept of social construction operates with regards to mental illness. At the least radical level, constructivists can hold that cultures impose models of ideal agency that are used to label sets of human behaviors as instances of ordered and disordered agency; behavioral syndromes, on this view, can be more or less pan-cultural though each culture develops a theory of ideal agency that renders some of these syndromes ‘illnesses’ while other cultures may group the syndromes differently according to different values (Sam and Moreira 2012). A more thorough-going constructivism understands these packages or syndromes of behavior as themselves objects of constructivism; for example, the set of behaviors currently associated with depression would not be seen as a natural (categorization-independent) grouping of properties. Instead, the set of behaviors we call ‘depressive’ exist only because they have been grouped together by clinicians (for any number of reasons) (Church 2001, 396-397). This form of constructivism claims that the only way to explain why a set of behaviors, feelings, thoughts, and so forth, are grouped into a syndrome is that clinicians have created this grouping. Unlike the set of behaviors characteristic of a heart attack, for which we have a readily available causal story that unifies them, mental illnesses lack a clinician-independent explanation for their grouping. On this view, syndromes are akin to what Ian Hacking has called “interactive kinds” (Hacking 1995, Hacking 1999). For Hacking, while natural kinds represent judgment-independent groupings in the world, an interactive kind “when known, by people or those around them, and put to work in institutions, change the ways in which individuals experience themselves—and may even lead people to evolve their feelings and behaviors in part because they are so classified” (Hacking 1999, 103). 
To think of mental illnesses, like multiple personality disorder (now Dissociative Identity Disorder), as an interactive kind is to say that multiple personality disorder is not a basic fact about human neurology discoverable by the neuroscientist; instead, once the concept of multiple personality disorder is identified, once a set of behaviors has come to be seen as a manifestation of the condition and clinicians have been trained to identify and treat it, then individuals will begin to understand themselves in terms of the new concept and behave accordingly. Some have argued that many paraphilias and personality disorders are best understood on the interactive kind model (Soble 2004, 60; Charland 2004a, 70).

Critics will note that the distinction between natural kinds and socially constructed kinds does not exhaust the alternatives. According to Nick Haslam, the natural kind distinction is tacitly invoked by realists about mental illness; this distinction, however, masks several possible alternative accounts of mental illness that allow for intermediate, less essentialist, even pluralist views (Haslam 2014, 13-20; see also Murphy 2014, 109).

d. Szasz’s Myth of Mental Illness

Perhaps the best-known critic of mental illness to arise out of the anti-psychiatry movement of the 1960s is Thomas Szasz. He published The Myth of Mental Illness in 1961, initiating a wide-ranging discussion of how best to understand the concept of a mental illness and its relation to physical illnesses. Szasz’s work was (and continues to be) the subject of significant discussion and debate. Szasz’s main claim is that the psychiatric field, and its concomitant conception of a mental illness, rests “on a serious, albeit simple, error: it rests on mistaking or confusing what is real with what is simulation; literal meaning with metaphorical meaning; medicine with morals…mental illness is a metaphorical disease” (Szasz 1974/1962, x). Mental illness should be understood as a metaphorical disease, according to Szasz, because it results from clinicians making a kind of category mistake. The mistake consists in taking concepts derived from one disciplinary body, medicine and the natural sciences, and applying them to a realm where they do not rightfully apply: human agency (Cresswell, 24).

According to Szasz, the proper world-view of the natural sciences is to construe its objects of study as law-like and deterministic. All knowledge in this domain is thought to be reducible to, and explainable in terms of, physicalism. Medicine, being a branch of science, understands medical illness on this model. A malfunctioning heart-valve has characteristic physical discontinuity with a functional one, it has typical effects on the function of the valve, and these effects are identifiable independent of patient symptoms. The treatment for medical illnesses relies on a thoroughly physicalist picture of the workings of the human body. Szasz believed that adopting the concept of a physical illness into the realm of mental illness is fundamentally incompatible with our concept of human agency. This results from two lines of argument. The first is that mental illnesses, unlike physical ones, are not typically reducible to biophysical causes (Szasz 1979, 22). If biological dysfunction cannot be used as a basis for delimiting mental illness then the only option left is to appeal to non-normative behavior. Szasz’s second concern is similar to the worries about neurobiological eliminativism mentioned in section 2(b). Szasz argues that the eliminativist’s picture of human agency is, at best, incomplete. The root of the problem stems from the fact that Szasz believes that we must view agents as necessarily free, capable of choice, and as responsible; “in behavioral science the logic of physicalism is patently false: it neglects the differences between persons and things and the effects of language on each” (Szasz 1974, 187). Szasz’s argument here is sometimes construed as an appeal to dualism. The physical world is deterministic but the mental world must necessarily be free.
Because the bio-psycho-social model uses concepts derived from natural sciences in a realm where they do not rightfully apply (that is, human agency), mental illness, as a concept derived from the natural sciences, is a myth resulting from this category mistake. To say that mental illness is a myth, however, is not meant as a denigration of individuals who suffer. It is, instead, meant to more accurately categorize their suffering as resulting from a failure to conform to social, legal, or ethical norms (Pickard 2009, 85).

Szasz’s critics have responded along several lines. Some do not take issue with his underlying understanding of the illness concept but disagree with his claim that it is not applicable to mental phenomena. Mental illnesses, according to these critics, have been (or will soon be) reducible to neurological or neurochemical dysfunction. They argue that advances in neuroscience give us reason for thinking that the prospect for finding the neurological or neurochemical correlates for at least some of our mental illness categories is high (Bentall 2004, 307). Other critics have argued in the other direction and attacked Szasz’s construal of physical illness. Szasz’s arguments have been taken, by some, to imply that physical illness itself is a deeply evaluative category reflective of value-judgments in much the same way mental illness is meant to be on Szasz’s account (Fulford 2004; Kendell 2004). Still others have aimed to preserve Szasz’s primary claim that the overarching category of ‘mental illness’ will prove to be a non-natural interactive-kind, reflective of our values and practices, while simultaneously maintaining that “particular kinds of mental illnesses may yet constitute valid scientific kinds” (Pickard 2009, 88).

3. Neurodiversity

Human cognitive and physical functions range widely across the species. Although most individuals fall within a statistically normal range in terms of their abilities in all of these arenas, statistical normalcy has long been criticized as a normative marker (Daniels 2007, 37-46). Advocates for what has come to be known as the ‘neurodiversity movement’ have begun, in part stemming from the criticisms of psychiatry and the DSM begun in the 1960s, to push for widespread acceptance of the forms of cognition beyond the “neuro-normal” that individuals operate with (Hererra 2013, 11). Members of the neurodiversity movement understand it as “associated with the struggle for the civil rights of all those diagnosed with neurological or neurodevelopmental disorders” (Fenton and Krahn 2007, 1). Forms of cognition currently seen as dysfunctional, ill, or disordered are better understood as representing diverse ways of seeing and understanding the space of reasons. Proponents of neurodiversity claim that agents on the autism spectrum, those with personality disorders, attention deficit and hyperactivity disorder, dyslexia, and perhaps even those with psychopathic traits should not suffer from the stigma associated with the illness label. Individuals to whom these labels apply often demonstrate profound capabilities (artistic, mathematical, and scientific) that are inseparable from the condition underlying their illness-label (Glannon 2007, 3; Ghaemi 2011). Pluralism about forms of human agency should be encouraged once we fully understand the problematic ways in which norms have come to influence illness categories.

a. Motivation

Applying the label “mentally ill” or “disordered” can have long-term negative effects not only by affecting how individuals to whom we apply the label view themselves (Charland 2004b, 338-340; Rosenhan 1973, 256) but also by affecting how others view and treat them (Didlake and Fordham 2013, 101). Often, the decision to create a new mentally ill class is made without the consultation of the groups involved. Homosexuality, for example, had been labeled a mental disorder in the first two editions of the DSM until social and political movements, largely headed by homosexuals themselves, caused the American Psychiatric Association to re-assess its stance (Bayer and Spitzer 1982, 32). The effects that being labeled mentally ill or disordered have on persons are wide-ranging and durable enough to warrant caution. Those in the neurodiversity movement argue, from various perspectives, that clinicians continue to mistake diverse forms of cognition (variations from the neuro-normal) for mental illness because of the assumption, which advocates argue is mistaken, that deviation from statistically-normal neural-development and function constitutes disorder. Advocates for neurodiversity typically argue along two lines. The first is to argue that our current concepts of mental dysfunction are in need of revision because they contain one or more of the problems described in section 2 of this entry. This line of argument focuses especially on issues over the role of power and value in the construction of mental illness categories. The second line of argument is “firmly grounded in motivations of an egalitarian nature that seek to re-weight the interests of minorities so that they receive just consideration with the analogous interests of those currently privileged by extant social institutions” (Fenton and Krahn 2007, 1). Any resulting account of neurodiversity must aim to preserve useful categories of illness or mental disorder (if only for the purposes of treatment).

Perhaps the most forceful arguments from the neurodiversity perspective target the status of autism as a form of mental disorder. Much controversy has followed the APA’s decision to fold the diagnosis of Asperger’s syndrome into the more general category of Autism Spectrum Disorder.

b. Autism, Psychopathy

Autism Spectrum Disorder is the diagnosis applied to a wide-ranging number of individuals who have demonstrated persistent difficulty with social understanding and communication and whose symptoms emerge quite early in development. For example, the DSM-5 lists “[i]mpairment of the ability to change communication to match context or the needs of the listener,” “[d]ifficulties following rules for conversation and storytelling,” and “[d]ifficulties understanding what is not explicitly stated (e.g., making inferences) and nonliteral or ambiguous meanings of language” as diagnostic for ASD (American Psychiatric Association 2013, 50-51). Advocates for neurodiversity argue that it is unjust to attempt to force those with ASD to modify their behavior in order to more closely match neurotypical behavior, especially as a form of treatment for a disease or disorder. For example, efforts to “change the diets of people with ASD, force them to inhale oxytocin, and expose children to countless hours of floor time or social stories to try to make persons with ASD more like neurotypicals” fail to realize that these attempts at changing individual cognition impose a narrow conception of proper functioning as a form of treatment. Furthermore, some argue that the case for treatments whose aim is to reduce ASD symptoms resembles arguments made by those wishing to eradicate other minority cultures defined by functioning (for example, deaf communities) (Barnbaum 2013, 134). Some individuals with ASD argue that they constitute their own unique culture that deserves respect (Glannon 2007, 2). Advocates for neurodiversity argue that conceptions of mental illness that include ASD assume that deviation from neurotypical function is evidence of mental dysfunction rather than a sign of the forms of neurodiversity present in any human population. Autistic flourishing must be understood as being different from (though not a degenerate form of) neurotypical flourishing.
Equally important within the call to neurodiversity is the project to identify and articulate the ways that social institutions are built around and advantage persons of “neurotypical” function over others (Nadesan 2005, 30). Given the proper account of functional agency, many individuals with ASD should be seen as functional and not disordered or mentally ill. Although not as common, similar arguments are sometimes advanced for other mental disorders including psychopathy.

Psychopathy is a controversial construct. As currently understood, it is a spectrum-disorder and is diagnosed using the revised version of what is known as the “Psychopathy Checklist” (PCL-R). Importantly, psychopathy does not appear in any version of the DSM as a distinct disorder. In its place, the DSM offers Antisocial Personality Disorder (ASPD). ASPD is intended as an equivalent diagnosis, though there is significant evidence that ASPD and Psychopathy are distinct (Gurley 2009, 289; Ramirez 2013, 221-223). Psychopathy, discussed in more detail in section 4a, is characterized by an inability to feel empathic distress (to find the suffering of others painful) along with a pronounced difficulty in understanding the differences between norms that are purely conventional versus other types of norms (Dolan and Fullam 2010, 995). Beyond these symptoms, however, psychopathy is characterizable as a distinct form of agency that raises concern about neurodiversity. Some psychopaths are ‘successful’ in the sense that they avoid incarceration while satisfying PCL-R diagnostic criteria. Psychopaths of this sort are much more likely to be found in corporate and other institutional settings (academia and legal, medical, or corporate professions) (Babiak 2010, 174). In these contexts, some have argued that psychopathic personality traits should be seen as virtues (Anton 2013, 123-125). A more contextual understanding of psychopathy as a distinct way of relating to reasons, persons, and situations may lead us to appreciate the distinct contributions persons with these traits can make. Psychopathy, especially the effects that psychopathy has on emotional and moral competence, has raised challenges to traditional theories of moral responsibility.

4. Responsibility and Autonomy

Accounts of mental illness are closely tied to accounts of agency and responsibility. It is not unusual, following an especially horrific crime, for public discourse to include questions about a suspect’s mental health history and whether a suspect’s alleged mental illness should excuse them from responsibility. Eric Harris, one of the teens responsible for the Columbine High School massacre, was called a psychopath by psychologist Robert Hare (Cullen); media commentators noted that Adam Lanza, the man responsible for killing 26 at Sandy Hook Elementary School in Connecticut, had been diagnosed with autism and raised questions about the role this may have played (Lysiak and Hutchinson). One reason why discussions like these happen so quickly after a crime likely has to do with the relationship between mental illness and the effects that mental illness is thought to have on responsibility. One view on the matter states that “[t]o diagnose someone as mentally ill is to declare that the person is entitled to adopt the sick role and that we should respond as though the person is a passive victim of the condition. Thus, the distinguishing features of dysfunction that we should look for are not a universally consistent set of exclusive qualities, but things that provide the grounds for the normative claim made by applying the label ‘mental illness’” (Edwards 2009, 80). A more careful analysis of the relationship between mental illness and theories of moral responsibility indicates that several factors are often thought to matter when it comes to holding a person with a mental illness responsible for what s/he has done.

a. Psychopathy

Philosophical theories of moral responsibility often make a distinction between two different aspects of responsibility: attributability and accountability (Watson 1996, 228). Attributability refers to all of the capacities that someone must have in order to be responsible. One minimal condition may be that an action is attributable to a person if it stems from her agency in the right sort of way. Accidental muscle spasms, for example, are not typically attributable to an agent.

If we are dealing with an agent that has satisfied these attributability conditions, we can ask further questions about how we should treat this person after she has acted. This is a question about accountability. Some philosophers have claimed that there are many different forms of accountability, each requiring its own justification (Fischer and Tognazzini 2012, 390). It is one thing to make sure that I intentionally made the rude comment at dinner; it is another to decide what should be done to me as a result. The former is a question about attributability; the latter is a question about accountability.

Emotional capacities form an important component of many theories of moral responsibility (Fischer and Ravizza 1999; Strawson 1962; Wallace 1994; Brink and Nelkin 2013). Reactive attitude theories give moral emotions a central location within a conception of attributability and accountability. The term ‘reactive attitude’ was originally coined by Peter Strawson as a way to refer to the emotional responses that operate in the context of responding (that is, reacting) to what people do (Strawson, 1962). Resentment, indignation, disgust, guilt, hatred, love, and shame (and potentially many others) are reactive attitudes. For Strawson, and philosophers who have followed him, to respond to a person’s action with one of these reactive attitudes is to simultaneously hold him accountable. A theory of moral attributability could be derived, in principle, via an examination of the conditions under which we believe it to be appropriate to respond to someone with a reactive attitude.

Reactive attitudes focus on the quality of their target’s will. What this means is that our reactive emotions are sensitive to facts about an agent’s intentions, desires, her receptivity to reason, and so forth. Philosophers refer to this as the Quality of Will Thesis. Reactive attitude theorists explain excuses and exemptions from responsibility by analyzing how an agent’s will affects our attitudes. Legitimate excuses, for example, lead us to believe that we should extinguish our reactive response to a person. Excuses, in effect, show us that we were wrong about the quality of a target’s will (Wallace 1994, 136-147). If you push me and I fall, I might resent you; however, if I realize that you pushed me in order to save me from oncoming traffic, my attitude will be modified. My resentment will have been extinguished and the pushing has been excused. Excuses inform us that we were mistaken about what action was done. Excuses are singular events: they do not cast doubt on a person’s agency, their attributability, but instead inform us that we were wrong about what intention/purpose we attributed to them. Agents that appear to be universally excused are more traditionally said to be exempt from responsibility.

An exemption occurs when we are led to question whether a person meets our attributability requirements. Imagine again that I am knocked over except this time I learn that the person who pushed me suffers from significant and persistent psychotic delusions. She believed, in that moment, that I was a member of the reptilian illuminati and pushing me would get the grey aliens to repossess her hated neighbor’s house. Unlike a case involving excuse, a person whose agency is hampered by delusions as severe as these is not a proper target for our reactive attitudes at all (Strawson 1962; Broome and Bortolotti 2009, 30). Agency as abnormal as this is better seen as exempt from judgments of attributability or accountability. Exempt agents are not true sources of their actions because exempt agents lack the ability to regulate their behavior in an intelligibly rational way (Wallace 1994, 166-180). It would not be appropriate to resent these agents.

The logic of excuses and exemptions has been thought to show that responsible agency requires that a responsible agent have epistemic access to moral reasons along with the ability to understand how these reasons fit together (Fischer and Ravizza 1997). Furthermore, some have proposed that an agent must have the opportunity to avoid wrongdoing (Shoemaker 2011, 6). Psychopaths seem to be rational and mentally ill at the same time; because of these features, they create difficulty for many theories of responsibility.

Perhaps the most notable diagnostic feature shared by psychopaths is an inability to feel empathic distress. You feel empathic distress when you are pained by the perception of others in pain. The processes that ground empathic distress are not thought to be under conscious control. Psychopaths do not respond as most people do when exposed to signs of others in pain (Patrick, Bradley and Lang 1993). Although the degree to which someone can have the capacity for empathic distress varies, psychopaths are significantly different from non-psychopaths (Flor et al. 2002).

Furthermore, psychopaths have significant difficulty distinguishing between different types of norms. Psychologists have noted that most people are readily able to tell violations of moral norms from violations of conventional norms (Dolan and Fullam 2010). Normal persons tend to characterize moral norms as serious, harm-based, not dependent on authority, and generalizable beyond their present context; conventional norms are characterized as dependent on authority and contextual (Turiel 1977). Children begin to mark the distinction between moral and conventional norms at around two years of age (Turiel 1977). Psychopaths, on the other hand, fail to consistently or clearly note the differences between them. Most psychopaths tend to treat all norms as norms of convention. Non-psychopaths note a difference between punching someone (a paradigmatic moral norm violation) and failing to respond in the third person to a formal invitation (a violation of a conventional norm). Although there is significant controversy about how much we can infer from the psychopath’s inability to mark the ‘moral / conventional’ distinction, the inability, along with their previously noted empathic deficit, has led some philosophers to argue that psychopaths cause problems for traditional theories of moral responsibility (Turiel 1977).

Reactive attitude theorists have argued that psychopaths should be exempt or excused from moral responsibility on both epistemic and fairness grounds. Given their difficulty distinguishing between moral and conventional norms, many reactive attitude theorists conclude that psychopaths are not properly sensitive to moral reasons and cannot be fairly held accountable (Fischer and Ravizza 1998; Wallace 1994; Russell 2004). It would be unfair to hold someone morally responsible if they cannot understand moral reasons; it is therefore inappropriate to express reactive attitudes at psychopaths (Fischer and Ravizza 1998, 78-79). However, some have argued that psychopathic agency can ground accountability ascriptions.

David Shoemaker, for example, has argued that: “[a]s long as [the psychopath] has sufficient cognitive development to come to an abstract understanding of what the laws are and what the penalties are for violating them, it seems clear that he could arrive at the conclusion that [criminal] actions are not worth pursuing for purely prudential reasons, say. And with this capacity in place, he is eligible for criminal responsibility” (Shoemaker 2011, 119). Although Shoemaker’s claim about legal responsibility has struck many as correct, the larger debate is over whether psychopaths are morally responsible for their choices given what we know about psychopathic agency.

If moral responsibility requires the capacity to understand moral reasons as distinctly moral and if, as many philosophers have supposed, this capacity is grounded on the ability to empathize with others, then psychopaths cannot understand moral reasons and should be excused. This puts pressure on Shoemaker’s characterization of psychopathic responsibility. If a psychopath’s understanding of moral reasons can be gauged by, for example, their poor ability to distinguish moral norms from conventional norms, then this also appears to be evidence for their lack of receptivity to moral reasons. Some philosophers have excused psychopaths for just this reason: “[c]ertain psychopaths…are not capable of recognizing…that there are moral reasons…this sort of individual is not appropriately receptive to reasons, on our account, and thus is not a morally responsible agent” (Fischer and Ravizza 1998, 79). Others, like Patricia Greenspan, have argued that psychopaths do have a form of moral disability, stemming from their emotional impairments, but that this form of disability should serve to mitigate, not extinguish, their responsibility (Greenspan 2003, 437).

Some philosophers note the consequences of psychopathic moral receptivity on the quality of will thesis. If reactive attitudes are sensitive to the quality of an agent’s will, then psychopaths cannot express immoral wills if they do not understand morality. If psychopaths cannot act on a will that merits reactive accountability then they lack attributability altogether. Jay Wallace has argued that “[w]hat makes it appropriate to exempt the psychopath from accountability…is the fact that psychopathy…disables an agent’s capacities for reflective self control” (Wallace 1994, 178).

Others argue that psychopaths may be held accountable by appealing to non-moral reactive attitudes like hatred, disgust or contempt. These attitudes, they claim, can be targeted at the quality of a psychopath’s will even if it is granted that they cannot act on immoral wills (Talbert 2012, 100). This is true even if the psychopath cannot appreciate that we have moral reasons for caring about our status as agents. Insofar as the psychopath can make judgments like these, then, in the words of Patricia Greenspan, “[the psychopath] is a fair target of resentment for any harm attributable to his intention to the extent that the reaction is appropriate to his nature and deeds. He need not be “ultimately” responsible in the sense that implies freedom to escape blame” (Greenspan 2003, 427). Because psychopaths are incapable of understanding moral reasons it is unfair to hold them morally responsible but there are forms of accountability and reactive address that are outside the moral sphere that may remain appropriate to direct at them.

Shame, in particular, appears to be a normatively significant reactive attitude that psychopaths have access to (Ramirez 2013, 232). Shame grounds a family of retributive forms of accountability and has been thought to serve as another way to hold psychopaths accountable even if it can be established that psychopaths are not capable of feeling or understanding moral reactive attitudes. If psychopaths are susceptible to shame, then they can be fairly held accountable on shame-based grounds.

It is fair to hold psychopaths accountable in these non-moral (shame-based) ways if they are able to feel the emotion being levied against them and can express a quality of will that these attitudes are sensitive to. More importantly, although psychopaths do not understand the distinctiveness and weight of moral reasons, their judgments can still express condemnable attitudes about those reasons. Greenspan notes that all of us have “blind spots” about certain narrow classes of reasons and we stand to those reasons in the same relation that psychopaths stand to moral reasons; these blind spots don’t excuse us from accountability (Greenspan 2003, 435).

b. Body Integrity Identity Disorder and Gender Dysphoria

Conceptions of mental illness, and of mentally impaired agency, figure prominently in questions regarding the best way to treat a disorder. In 1997, Robert Smith, a surgeon at the Falkirk and District Royal Infirmary in Scotland, amputated one of his patient’s limbs at the patient’s request. The limb itself was healthy. There was no medical justification for the amputation. In 1999, Smith amputated another patient’s healthy limb, again at the request of the patient, and was scheduled to perform a third amputation (on a third patient) before the hospital’s board of directors forbade him from amputating any more healthy limbs. Smith’s patients came to him with a set of symptoms that do not correspond to any particular disorder in the DSM. Smith’s patients were not under the delusion that their limbs did not belong to them; they did not see their limbs as disfigured or disgusting. Instead, his patients claimed that, from a young age, they had not thought of the limb as part of their authentic selves. They were, the patients claimed, never meant to be born with the limb and were seeking surgery to allow their inner representation of their bodily identity to match their external body presentation. The only way to do this was to amputate their healthy limb.

Patients who seek to radically alter their body via repeated surgeries or extreme dieting are ordinarily (barring other symptoms) diagnosed with Body Dysmorphic Disorder (BDD). BDD, however, requires that patients seek to modify their bodies because they find a specific part of their body disgusting or revolting or flawed. Patients with BDD also tend to engage in obsessive behaviors related to the body part’s appearance (grooming, ‘mirror checking,’ etc.) (APA 2013, 248). Smith’s patients, although they claimed to experience significant dysphoria because of their condition, did not do so because they found their limbs revolting or disfigured. They identified themselves as having a different condition: Body Integrity Identity Disorder (BIID). Like psychopathy, BIID is not a disorder cataloged in the DSM. The APA does, however, recognize that it appears distinct from BDD. “Body Integrity Identity disorder (apotemnophilia)…involves a desire to have a limb amputated to correct an experience of mismatch between a person’s sense of body identity and his or her actual anatomy. However, the concern does not focus on the limb’s appearance, as it would be in body dysmorphic disorder” (APA 2013, 246-247). Vilayanur Ramachandran and Paul McGeoch claim that they have discovered several of the neural correlates of BIID and these appear distinct from BDD; specifically, they claim that the disorder arises in part from a dysfunction of the right parietal lobe (Ramachandran and McGeoch 2007, 252).

Apart from the conceptual question over whether BDD and BIID are manifestations of the same underlying mental illness, individuals who claim to suffer from BIID raise significant ethical questions over the nature of mental illness, autonomy, and surgical treatments for dysphoria. Patients with BIID request that surgeons recognize and grant their request for surgical intervention to cure psychological suffering. The purpose of these amputations, patients claim, is to correct what they see as a mismatch between their inner and outer selves. Although the case of BIID has not received widespread philosophical attention, several different approaches have been advanced with regard to BIID patient requests for amputation. Some philosophers have raised doubts about the ability of BIID patients to act on genuinely autonomous decisions (Mueller 2009, 35). One worry about challenging the autonomy of otherwise rational agents is that, in other domains, we appear to allow individuals significant freedom to modify their bodies for many reasons (aesthetic, political, self-expression, and so forth) without thereby questioning their status as autonomous agents (Bridy 2004). The right to bodily autonomy is typically construed as one of the guiding values in biomedical decision-making. Furthermore, BIID sufferers who have their requests for amputation denied often resort to self-harm. Many will harm their limbs to the point where amputation becomes medically necessary. Some have argued that it is morally permissible to grant BIID requests for amputation on the basis of harm-prevention (Bayne and Levy 2005, 78). Others have expressed concern over the use of surgical treatments for mental illnesses (if it is granted that BIID is a mental illness), given that the surgery persons with BIID are requesting involves the permanent removal of a capacity typically thought to be important (Johnston and Elliott 2002, 430).

Given that BIID patients appear to have a locatable dysfunction in the right parietal lobe (an area where internal body representations are thought to be located), some philosophers have argued that surgical treatments are unjustified if a non-surgical solution can be found. That is, if BIID results from the suffering that is caused by a mismatch between a patient’s internal representation of herself and her outer presentation, then, if it is possible to change the inner representation and thereby avoid surgery, we ought to do so (Johnston and Elliott 2002, 432). This approach, however, forces us to confront philosophical responses to other conditions that involve mismatches between a person’s inner representation of their body and their external bodily presentation. In particular, patients with BIID argue that their condition is analogous to the suffering faced by those with gender dysphoria. These individuals often seek sexual reassignment surgery to alleviate their perceived embodiment mismatch (Bayne and Levy 2005, 80). Individuals who are suffering as a result of their assigned sex/gender and who exhibit a strong desire to alter their sex and gender characteristics can be diagnosed with Gender Dysphoria (APA 2013, 451-459). Unlike other patients desiring surgical body modification (for self-expression, to meet unrealistic gender ideals, and so forth), individuals with BIID and individuals with Gender Dysphoria both report that their desires for surgical alteration of their body presentation originate at a young age. Both groups seek to have their request for surgical alteration respected by those around them as a recognition of their autonomy and of the value that gender (or bodily integrity) plays in the formation of an authentic self (Lombardi 2001, 870).

The discussion of BIID, its status as a mental disorder, and the ethics of granting a person’s request for amputation are all relatively new and hotly debated topics within the Philosophy of Mental Illness and Bioethics generally. This debate is, however, connected to larger, better-established questions concerning patient autonomy and what it means for an agent to make autonomous choices. At the moment there is neither a clear consensus on the status of BIID as a disorder nor a received view on how to treat BIID requests for amputation.

5. References and Further Reading

American Psychiatric Association. (1952). Diagnostic and Statistical Manual of Mental Disorders. Washington, DC.
American Psychiatric Association. (1973). “Homosexuality and Sexual Orientation Disturbance: Proposed Change in DSM-II, 6th Printing, page 44 POSITION STATEMENT (RETIRED).” Arlington VA.
American Psychiatric Association. (2013). Diagnostic and Statistical Manual of Mental Disorders, 5th ed. Washington, DC.
Anton, Audrey L. (2013) “The Virtue of Psychopathy: How to Appreciate the Neurodiversity of Psychopaths and Sociopaths Without Becoming A Victim.” Ethics and Neurodiversity Cambridge Scholars Publishing: Newcastle upon Tyne: 111-130.
Babiak P., Neumann C., and Hare R.D. (2010). “Corporate Psychopathy: Talking the Walk.” Behavioral Sciences and the Law 28(2): 174-193.
Barnbaum, Deborah. (2013). “The Neurodiverse and the Neurotypical: Still Talking Across an Ethical Divide.” Ethics and Neurodiversity Cambridge Scholars Publishing: Newcastle upon Tyne: 131-145.
Bayer, Ronald and Robert L. Spitzer. (1983). “Edited correspondence on the status of homosexuality in DSM-III.” Journal of the History of the Behavioral Sciences Vol. 18(1): 32–52.
Bayne, Tim and Neil Levy. (2005). “Amputees By Choice: Body Integrity Identity Disorder and the Ethics of Amputation.” Journal of Applied Philosophy 22(1): 75-86.
Bentall, Richard. (1990). “The Syndromes and Symptoms of Psychosis: Or why you can’t play ‘twenty questions’ with the concept of schizophrenia and hope to win.” Reconstructing Schizophrenia Routledge: London.
Bentall, Richard. (1992). “A Proposal to Classify Happiness as A Mental Disorder.” Journal of Medical Ethics 18(2): 94-98.
Bentall, Richard. (2004). “Sideshow? Schizophrenia construed by Szasz and the neo-Kraepelinians.” In J.A. Schaler (Ed.) Szasz under Fire: The Psychiatric Abolitionist Faces His Critics. Peru, Illinois: Open Court.
Boorse, C. (1975). “On the distinction between disease and illness.” Philosophy and Public Affairs, 5: 49-68.
Boorse, C. (1997). “A rebuttal on health.” In J.M. Humber and R.F. Almeder (eds.), What Is Disease? Totowa N.J.: Humana Press: 1-134.
Boyd, Richard. (1991). “Realism, antifoundationalism, and the enthusiasm for natural kinds.” Philosophical Studies 61: 127-148.
Broome, Matthew and Lisa Bortolotti. (2009). “Mental Illness as Mental: In Defense of Psychology Realism.” Humana Mente 11: 25-44.
Bridy, A. (2004). “Confounding extremities: Surgery at the medico-ethical limits of self-modification.” Journal of Law, Medicine and Ethics 32(1): 148–158.
Brink, David and Dana Nelkin. (2013). “Fairness and the Architecture of Responsibility.” In David Shoemaker (Ed). Oxford Studies in Agency and Responsibility Volume 1. Oxford University Press.
Brülde, B., and F. Radovic. (2006). “What is mental about mental disorder?” Philosophy, Psychiatry, & Psychology 13(2): 99–116.
Charland, Louis. (2004a). “Character Moral Treatment and Personality Disorders.” Philosophy of Psychiatry. Oxford University Press: 64-77.
Charland, Louis. (2004b). “A Madness for Identity: Psychiatric Labels, Consumer Autonomy, and the Perils of the Internet.” Philosophy, Psychiatry, and Psychology 11(4): 335-349.
Chomsky, Noam. (1988). Language and Problems of Knowledge: The Managua Lectures. Cambridge, Mass. / London, England: MIT Press (Current Studies in Linguistics Series 16).
Church, Jennifer. (2004). “Social Constructionist Models” The Philosophy of Psychiatry Oxford University Press: 393-406.
Churchland, P. M., (1981). “Eliminative Materialism and the Propositional Attitudes,” Journal of Philosophy 78: 67–90.
Cresswell, Mark. (2008). “Szasz and His Interlocutors: Reconsidering Thomas Szasz’s “Myth of Mental Illness” Thesis” Journal for the Theory of Social Behavior 38(1): 23-44.
Cullen, Dave. (2004). “The Depressive and the Psychopath: At last we know why the Columbine killers did it.” Slate. Web. April 2004.
Daniels, Norman. (2007). Just Health: Meeting Health Needs Fairly. Cambridge University Press: NY.
Dolan, M.C., Fullam, R.S. (2010). “Moral/conventional Transgression Distinction and Psychopathy in Conduct Disordered Adolescent Offenders.” Personality and Individual Differences Vol. 49: 995–1000.
Edwards, Craig. (2009). “Ethical Decisions in the Classification of Mental Conditions as Mental Illness.” Philosophy, Psychiatry, and Psychology 16(1): 73-90.
Elliott, Carl. (2004). “Mental Illness and Its Limits” The Philosophy of Psychiatry Oxford University Press: 426-436.
Fenton, Andrew and Tim Krahn. (2007). “Autism, Neurodiversity and Equality Beyond the ‘Normal’” Journal of Ethics in Mental Health 2(2): 1-6.
Fischer J.M., Ravizza M. (1998). Responsibility and Control: A Theory of Moral Responsibility. New York: Cambridge University Press.
Fischer J.M., Tognazzini N.A. (2011). “The Physiognomy of Responsibility.” Philosophy and Phenomenological Research 82(2): 381-417.
Freud, Sigmund. (1905/1997). Dora: An Analysis of a Case of Hysteria. Simon and Schuster: NY.
Freud, Sigmund. (1915-1917 / 1977). Introductory Lectures on Psychoanalysis. W.W. Norton and Company: NY.
Friedan, Betty. (1963). The Feminine Mystique. W.W. Norton and Company: NY.
Foucault, Michel. (1961/1988). Madness and Civilization: A History of Insanity in the Age of Reason. Random House: NY.
Fulford, K.W.M. (2001). “What is (mental) disease?: An open letter to Christopher Boorse.” Journal of Medical Ethics 27(2): 80–85.
Fulford, K.W.M. (2004). “Values Based Medicine: Thomas Szasz’s Legacy to Twenty-First Century Psychiatry.” In J.A. Schaler (Ed.) Szasz under Fire: The Psychiatric Abolitionist Faces His Critics. Peru, Illinois: Open Court.
Ghaemi, Nassir. (2003). The Concepts of Psychiatry Johns Hopkins University Press.
Ghaemi, Nassir. (2011). A First Rate Madness. Penguin Press: NY.
Glannon, Walter. (2007). “Neurodiversity” Journal of Ethics in Mental Health 2(2): 1-5.
Goldman, Alan. (2002). “Plain Sex.” In Alan Soble (ed.), The Philosophy of Sex: Contemporary Readings, 4th ed. Lanham, MD: Rowman and Littlefield: 39-55.
Graham, George. (2010). The Disordered Mind: An Introduction to the Philosophy of Mind and Mental Illness. Routledge: NY.
Graham, George. (2013a). The Disordered Mind: An Introduction to the Philosophy of Mind and Mental Illness. Routledge: NY.
Graham, George. (2013b). “Ordering Disorder: Mental Disorder, Brain Disorder, and Therapeutic Intervention” in K. Fulford (ed) Oxford Handbook of Philosophy and Psychiatry. Oxford UP.
Graham, George. (2014). “Being a Mental Disorder” in Harold Kincaid & Jacqueline A. Sullivan (eds.) Classifying Psychopathology: Mental Kinds and Natural Kinds: 123-143.
Greenspan, Patricia. (2003). “Responsible Psychopaths” Philosophical Psychology 16(3): 417-429.
Grob, G.N. (1991). “Origins of DSM-I: a study in appearance and reality.” American Journal of Psychiatry 148(4): 421-431.
Gurley, Jessica. (2009). “A History of Changes to the Criminal Personality in the DSM” History of Psychology 12(4): 285-304.
Hacking, Ian. (1995). Rewriting the Soul: Multiple Personality and the Science of Memory. Princeton, NJ: Princeton University.
Hacking, Ian. (1999). The Social Construction of What? Cambridge: Harvard University Press.
Hansen, Jennifer. (2004). “Affectivity: Depression and Mania” Philosophy of Psychiatry Oxford University Press: 36-53.
Hare, R.D., Clark D., Grann M., Thornton D. (2000). “Psychopathy and the Predictive Validity of the PCL-R: An International Perspective.” Behavioral Sciences and the Law 18(5): 623-45.
Haslam, Nick. (2014). “Natural Kinds in Psychiatry: Conceptually Implausible, Empirically Questionable, and Stigmatizing” in Harold Kincaid & Jacqueline A. Sullivan (eds.) Classifying Psychopathology: Mental Kinds and Natural Kinds: 11-28.
Herrera, C.D. (2013). “What’s the Difference?” Ethics and Neurodiversity Cambridge Scholars Publishing: Newcastle upon Tyne: 1-17.
Horwitz, Allan V. (2001). Creating Mental Illness. University of Chicago Press.
Johnston, Josephine and Carl Elliott. (2002). “Healthy limb amputation: ethical and legal aspects” Clinical Medicine 2(5): 431-435.
Kandel, Eric. (1998). “A new intellectual framework for psychiatry.” American Journal of Psychiatry 155: 457-469.
Kendell, R.E. (2004). “The Myth of Mental Illness.” In J.A. Schaler (Ed.) Szasz under Fire: The Psychiatric Abolitionist Faces His Critics. Peru, Illinois: Open Court.
Kraepelin, Emile. (1896a) Psychiatrie (8th edn). Reprinted (1971) in part as Dementia Praecox and Paraphrenia (trans. R. M. Barclay). Huntington, NY: Robert E. Krieger.
Kraepelin, Emile. (1896b) Psychiatrie (8th edn). Reprinted (1976) in part as Manic-Depressive Insanity and Paranoia (trans. R. M. Barclay). Huntington, NY: Robert E. Krieger.
Levy, Neil. (2007). “The Responsibility of the Psychopath Revisited” Philosophy, Psychiatry, and Psychology: 129-138.
Lilienfeld, S.O. and L. Marino. (1995). “Mental disorder as a Roschian concept: a critique of Wakefield’s “harmful dysfunction” analysis.” Journal of Abnormal Psychology 104(3): 411-20.
Lombardi, E. (2001). “Enhancing Transgender Care.” American Journal of Public Health 91(6): 869-872.
Lysiak, M. and Bill Hutchinson. (2013). “Emails show history of illness in Adam Lanza’s family, mother had worries about gruesome images.” New York Daily News. Web. April 2013.
Maddux, James. (2001). “Stopping the Madness.” The Handbook of Positive Psychology: 13-25.
Mueller, S. (2009). “Body integrity identity disorder (BIID): Is the amputation of healthy limbs ethically justified?” American Journal of Bioethics 9: 36–43.
Murphy, Dominic. (2014). “Natural Kinds in Folk Psychology and in Psychiatry.” in Harold Kincaid & Jacqueline A. Sullivan (eds.) Classifying Psychopathology: Mental Kinds and Natural Kinds: 105-122.
Nadesan, M.H. (2005). Constructing Autism. Milton Park, Oxfordshire: Routledge.
Nichols, Shaun and Manuel Vargas. (2007). “How to Be Fair to Psychopaths.” Philosophy, Psychiatry, and Psychology 14(2): 153-155.
Phillips, Katharine, et al. (2010). “Body Dysmorphic Disorder: Some Key Issues for DSM-V.” Depression and Anxiety 27: 573-591.
Pickard, Hannah. (2009). “Mental Illness is Indeed A Myth” Psychiatry as Cognitive Neuroscience: 83-101.
Pickard, Hannah. (2011). “What is Personality Disorder?” Philosophy, Psychiatry, and Psychology Vol. 18 (3): 181-184.
Pickering, Neil. (2003). “The Likeness Argument and the Reality of Mental Illness” Philosophy, Psychiatry, and Psychology 243-254.
Ramachandran, V., and McGeoch, P. (2007). “Can vestibular caloric stimulation be used to treat apotemnophilia?” Medical Hypotheses 8: 250–252.
Ramirez, Erick. (2013). “Psychopathy Moral Reasons and Responsibility” Ethics and Neurodiversity Cambridge Scholars Publishing: Newcastle upon Tyne: 217-237.
Ramsey, W., Stich, S. and Garon, J., (1990). “Connectionism, Eliminativism and the Future of Folk Psychology,” Philosophical Perspectives 4: 499–533.
Robertson, Erik D. and Lennart Mucke. (2006). “100 Years and Counting: Prospects for Defeating Alzheimer’s Disease.” Science: Vol. 314 no. 5800 pp. 781-784.
Rosenhan, David. (1973). “On Being Sane in Insane Places” Science: 250-258.
Sam, David and Virginia Moreira. (2012). “Revisiting the Mutual Embeddedness of Culture and Mental Illness” Online Readings in Psychology and Culture.
Soble, Alan. (2004) “Desire Paraphilia and Distress in DSM IV.” Philosophy of Psychiatry Oxford University Press: NY: 54-63.
Strawson, P.F. (1962). “Freedom and Resentment.” Proceedings of the British Academy 48: 1-25.
Szasz, Thomas. (1961/1984). The Myth of Mental Illness Harper Perennial.
Szasz, Thomas. (1979). Schizophrenia: The Sacred Symbol of Psychiatry. Oxford: Oxford University Press.
Talbert, Matthew. (2012) “Moral Competence, Moral Blame, and Protest.” Journal of Ethics 16(1): 89-109.
Vargas, Manuel and Shaun Nichols. (2007). “Psychopaths and Moral Knowledge” Philosophy, Psychiatry, and Psychology: 157-162.
Von Eckardt, Barbara and Jeffrey Poland. (2005). “Mechanism and Explanation in Cognitive Neuroscience” Proceedings of the Philosophy of Science Association: 972-984.
Wakefield, Jerome. (1992). “The Concept of Mental Disorder: On the Boundary Between Biological Facts and Social Values” American Psychologist: 373-388.
Wakefield, Jerome. (1999). “Evolutionary versus prototype analyses of the concept of disorder.” Journal of Abnormal Psychology 108: 374-399.
Wakefield, Jerome. (2006). “What Makes A Mental Disorder Mental?” Philosophy, Psychiatry, & Psychology 13(2): 123-131.
Wallace, R.J. (1994). Responsibility and the Moral Sentiments. Cambridge, Mass: Harvard University Press.
Watson, Gary. (1996). “Two Faces of Responsibility.” Philosophical Topics 2: 227-248.
Woolfolk, Robert. (1999). “Malfunction and Mental Illness” The Monist 82(4): 658-670.
Zachar, Peter. (2014). A Metaphysics of Psychopathology, MIT Press: Cambridge Massachusetts.

Author Information

Erick Ramirez
Email: ejramirez@scu.edu
Santa Clara University
U. S. A.

Roy Wood Sellars (1880—1973)

Roy Wood Sellars was one of a generation of systematic philosophers in America the likes of which has not been seen before or since. He was born in Seaforth, Ontario in Canada, and spent most of his career at the University of Michigan where he continued working well into his 90s. He was a fiercely independent thinker who resisted the fashions of the day in order to follow his own instincts. He believed that the philosopher should be well-grounded both in the history of philosophy and in the sciences, and that the philosopher should engage philosophically with the major moral, social, and political issues of the day. His central aims were to combine and harmonize the insights of science and common sense, to update religion with the scientific advances of the day, and to promote a science-grounded system of progressive humanistic values. Over the course of his long life, Sellars wrote and published prolifically. He is the author of 15 books, over 100 articles, 14 book reviews and several miscellaneous works. He is best known for his pioneering formulations of critical realism (roughly, the view that, first, human beings normally perceive independent objects with their sensations but do not perceive sensations, and, second, human beings must interpret their sensations), evolutionary naturalism (a naturalistic version of emergent evolution), the “double knowledge” and mind-brain identity theory (the view that human beings possess two modes of knowledge of a single material reality), and a defence of religious humanism (the view that religion must be reinterpreted in terms of its role in improving humanity’s “this-worldly” existence). He is the primary author of the Humanist Manifesto I of 1933. Finally, he is the father of Wilfrid Sellars, a highly influential philosopher in his own right, many of whose views, allowing for the different vernacular and emphasis of the two periods, are continuous with his father’s views.

Biography
Critical Realism
Evolutionary Naturalism
Organicism
Value Theory
Socialism
The Humanist Manifesto
References and Further Reading
1. Primary Sources
2. Secondary Sources

1. Biography

Roy Wood Sellars (July 9, 1880-Sept. 5, 1973) was born in Seaforth, Ontario, the second son of Ford Wylis and Mary Stalker Sellars. (Warren 2007, 211 lists Sellars’ birth year as 1883, but this is an aberration. Most sources, including Warren elsewhere, give the 1880 date. See Warren 1970, xi-xxv; 1973, 19-22; 1975, Ch. 1; Frankena 1973-74.) His ancestors had migrated from the Glasgow region in Scotland to Nova Scotia and later moved to Ontario. In Ontario, the Sellars clan married into the prestigious Wood family, which included a distinguished Captain from the War of 1812 (David Wood) and the acting commissioner of the North West Mounted Police and Commissioner of the Canadian Yukon Territory (Zachary Taylor Wood). This made him a relative of the 12th president of the United States (Zachary Taylor). Sellars also took great pride in the fact that one of his ancestors, Lord Stanley, appears at Bosworth Field in Shakespeare’s Richard III.

Roy’s father, Ford, had been a schoolteacher and a school principal until health considerations forced him to abandon that profession. Thereafter, Ford studied at the Medical School at the University of Michigan and became a physician in 1882. After Ford graduated from medical school, the Sellars family settled in Pinnebog, Michigan. As this was a small town, Roy’s youthful companions were farm boys. In his youth, Roy pursued swimming, baseball, and hockey, and retained an interest in sports all his life. His father’s library was the only one in the neighbourhood, and though young Roy knew little about philosophy, he read Emerson and Carlyle and had numerous discussions with his father about medicine. In this small rural community, Roy’s intellectual gifts quickly set him apart and he was sent to the Ferris Institute in Big Rapids, Michigan to prepare for a university career.

Roy entered the University of Michigan in 1899, where he did his own cooking and washed dishes for his lodgings. Due to his small-town, rural background, the insecure young boy felt unprepared for a university program but he resolved to “make a go of it” and, upon his graduation, was voted one of the top two scholars in the class. He studied widely in both the arts and the sciences, including rhetoric and calculus. Sellars received his B.A. in 1903 from the University of Michigan and went on to Hartford Theological Seminary, where he studied New Testament Greek, Hebrew, and Arabic (and read the Koran in the original). He acquired a critical, historically and culturally grounded approach to religion and a sympathy for social liberalism and humanism that remained with him throughout his life. In 1904 Professor R.M. Wenley of the Department of Philosophy at Michigan recommended Sellars for a fellowship at the University of Wisconsin, where he studied for a time before returning to the University of Michigan as Professor Wenley’s replacement while the latter was on sabbatical leave. Apart from a brief stint at the University of Chicago in the summer of 1906 and a year studying in Europe (either 1908-09 or 1909-10 – sources differ on this), Sellars remained at the University of Michigan for the remainder of his approximately 40-year career, first as an instructor and doctoral student (he earned the Ph.D. in 1908 or 1909 – again, sources differ), and then as a member of the permanent faculty.

During his year in Europe Sellars studied at the Sorbonne and discussed the possibility of a naturalistic formulation of emergent evolution with Henri Bergson. Bergson in turn referred him to the scientist and vitalist Hans Driesch. Sellars went on to study with Driesch and the neo-Kantian Wilhelm Windelband at Heidelberg. The precise details of Driesch’s influence on Sellars are not known but it seems likely that he directed Sellars to the study of physiology. After returning to Michigan from his European adventures, Sellars developed a new course in the philosophy of science in which he used James Ward’s Naturalism and Agnosticism, as well as texts by Huxley, Mach, Poincaré, and Pearson. Many of his students at this time came from the physical and biological sciences. Sellars remained scientifically-oriented throughout his life, a trait which he passed to his son Wilfrid. Even when Sellars was inspired by Bergson’s romantic or mystical theory of creative evolution, he sought (much like Popper) to recast it in more “naturalistic” terms acceptable to the sciences. Sellars’ naturalistic bent put him at odds with his most ardent supporter, Professor Wenley. Although Wenley regarded him as his best student, he could not accept Sellars’ naturalism, and did not approve of the publication of Sellars’ thesis by the University.

Sellars enjoyed considerable teaching success. His course, “The Principles and Problems of Philosophy,” was favorably remembered by many alumni who found it a “liberating” experience, “like taking a cold bath” (Frankena, 1973-74, 230). Several of the students in his political philosophy course, in which he discussed democracy, communism, socialism, and fascism, remarked that though they had expected him to be a propagandist, the course turned out to be a good scholarly treatment of the issues with no discernible bias. Sellars had earlier taught a course in elementary logic and eventually published a textbook based on that course. It was a chance reading of that textbook that led Charles Stevenson to the study of philosophy; Stevenson later became one of Sellars’ colleagues (Frankena, 1973-74, 230).

Sellars married his cousin Helen Maud Stalker, an intelligent and accomplished woman, in 1911. He wrote the Preface to Helen’s translation (from the French) of Celestine Bougle’s Evolution of Values. Helen provided Sellars with much support and they remained close until her death in 1962. In 1912 and 1913, respectively, their two children, Wilfrid and Cecily, were born. Cecily became a state-employed psychologist in North Dakota, but was killed in an automobile accident in 1954, an event which impacted Sellars’ scholarly work for decades. Twenty years later, well into his 90s, he was still working on papers that had been in progress at the time of her death. Wilfrid Sellars went on to become a highly influential philosopher in the latter half of the 20th century who, like his father, emphasized a firm grounding in the history of philosophy, fluency in the sciences, and a systematic approach to philosophical problems. It is noteworthy that Wilfrid developed a sophisticated version of scientific realism that builds on his father’s critical realism. In fact, Wilfrid’s views are often similar in substance to his father’s even if they differ in language and style.

Sellars believed in a fruitful, reciprocal relationship between epistemology and ontology, but saw epistemology as philosophically basic. His most vehement criticism of other philosophers was often that they were weak in epistemology, but he also considered himself a proud ontologist. Sellars also had a strong interest in ethics, social philosophy, and political philosophy. Indeed, Sellars, like his son Wilfrid, belonged to a genre of philosophers, rare today, who believed that a philosopher must be knowledgeable in virtually all areas of philosophy. Sellars made contributions to epistemology, metaphysics, ethics, the philosophy of science, social and political philosophy, and the history of philosophy. He could discourse in an intelligent and informed way about Heidegger, Sartre, and Bergson just as he could about Russell, Carnap, or Einstein. He was as at home in a discussion about ethics or social and political philosophy as he was in logic or scientific method.

Sellars was an independent thinker who resisted the fashions of the day in order to pursue his own philosophical direction. He formulated what may have been the most viable form of realism in his era. He offered a course, titled “Main Concepts of Science,” that may have been the very first course offered anywhere in the philosophy of science. He formulated evolutionary naturalism, the view that life and mind are emergent products of naturalistically conceived evolution (i.e., without invoking the supernatural element in Alexander or Bergson’s élan vital). He (1923b; 1938a) pioneered the identity theory of the “brain-mind,” which he called the “double knowledge emergence approach” to mind-brain identity. Although his basic views changed little over his career, he was constantly reformulating, developing, and clarifying them. In his later years he watched as many of his views became commonplace, without being recognized for his role in their genesis.

Perhaps because of his fierce independence, Sellars often found himself out of the mainstream. Until 1930, philosophy was dominated by idealism and pragmatism, religion by theism, and social theory by capitalism, while Sellars was a realist, an atheist, and a socialist. Later, analytical philosophy came into dominance and fundamentalism resurged in religion, neither of which appealed to him. Socialism did eventually enjoy a resurgence, but it was Marxist and totalitarian while Sellars was committed to a more moderate and gradual reform of social institutions based on rational persuasion. Sellars was also critical of the American philosophy in his day. He (1970a, vii; see also Warren, 1975, 28) once remarked that, amongst philosophers, it is “almost always a Sellars against the world”. He often felt that he was better understood by psychologists and biologists than by philosophers and that he was better understood in Europe than in America (Warren 1975, 25).

Nonetheless, Sellars was a respected member of the philosophical community in America and it is safe to say that he inspired an unusual degree of personal affection from many of his colleagues. He served as Vice-President of the Eastern Division of the APA in 1918 and President of the Western Division in 1923. He was an energetic correspondent and carried on friendly discussions with Samuel Alexander, C. Judson Herrick, Lloyd Morgan, and Marvin Farber. He also corresponded with F.H. Bradley, Bernard Bosanquet, C.A. Strong, and Donald Williams, and he debated with D.C. Macintosh, H.N. Wieman, and Sidney Hook. In 1954 the journal Philosophy and Phenomenological Research devoted an entire issue to Sellars’ philosophy, and in 1964 Andrew Reck listed Sellars as one of the 10 most notable philosophers in recent American philosophy. At the University of Michigan the Roy Wood Sellars Chair was created in his honor and Bucknell University honored him by establishing the Roy Wood Sellars Lecture Series. The first Roy Wood Sellars Lectures were given by Warren and the second by Wilfrid Sellars with Roy Sellars in attendance. In 1970 Notre Dame University honored Roy in his 90th year with a symposium on his philosophy, including presentations by Andrew Reck, Wilfrid Sellars, and C.F. Delany. Although Roy belonged to a generation of America’s greatest systematic philosophers, Frankena (1973-4, 231) observes that, with hindsight, Sellars may have been one of the most important of them. However, the fact that his son Wilfrid has developed a powerful formulation of his father’s views may be the greatest testament to Roy Wood Sellars’ lasting achievement.

2. Critical Realism

Much of Sellars’ philosophical work is an attempt to replace outdated mythopoetical views about knowledge, religion, values, and so forth, by up-to-date scientifically grounded views. Science, he holds, “builds” on common sense, but since it develops new concepts based on new instruments and the application of mathematics to experience, the philosopher’s job is to harmonize the common sense and scientific frameworks (1932a, v; 1973, 160-161).

In his first book, Critical Realism, he attempts to justify common sense realism, the view that people perceive real external objects, not just intermediaries of some kind, which is also the view of philosophers when they are not in a reflective mood (1916a, 6). He also aims to clarify the relation of common sense realism to scientific knowledge: “We start from independent things; and not from percepts” (1916a, 3). Sellars also argues against the main theories of perception of his day: idealism, representationalism, pragmatism, and positivism, all of which he saw as undermining common sense realism. Other versions of critical realism were espoused by Santayana and Lovejoy.

The defence of common sense realism, he (1941b; 1959c) holds, requires a robust defence of the correspondence theory of truth. The basic error in those mistaken views of perception is the failure to distinguish between the content and object of perception (1922a, 70 n 4). Since the content of perception is fixed by aspects of the organism, those mistaken theories wrongly infer that the object of perception is not independent of the perceiver.

Sellars’ critical realism requires real substances (as opposed to ideas, universals, impressions, and so forth) as objects of perception. He (1929c; 1970a, 32; 1973, 182, 346-348, 353) rejects “the historical desiccation of the category of substance,” that is, the whittling down of the Ancient and Medieval robust notion of substance to Locke’s “I know not what”. While representationalism, idealism, pragmatism, and positivism tend to volatize the object of perception into ideas, sensations, or a mere placeholder for properties, Sellars holds the normal objects of perception are real full-bodied independent substances.

Although Sellars’ critical (or “referential”) realism is “built up from” common-sense realism, it is not identical with common sense since the latter has not faced the problems arising from discrepancies in perception (See his 1922c; 1924b; 1927b; 1927c; 1937a; 1938b; 1939b; 1959b; 1961a; 1962; 1963; 1970, 6-8, 13, 15-16, 17-27, 33-35, 161; see also Warren, 1975, 35, 37). Despite his defense of common sense realism, Sellars rejects the “naïve realism” that identifies the immediate datum of knowledge with objects in the world. He distinguishes between the common sense realism of the ordinary person and the crude philosophical understanding embodied in naïve realism, the view that in perception one actually “intuits” the object (1963; see also Warren, 1975, 36-7). In opposition to that naïve view, he holds that in perception one interprets one’s sensations. The interpretation of sensations is not a purely intellectual process: “A gull does not in the Lockean way apprehend his sensation …. [It] looks through his sensation at the fish in the water. It is a one-step sensi-motor process” (See his 1970a, 118; 1973, 49-50, 161; 1975; and Warren, 1975, 38-45).

Sellars holds that the biological basis of knowledge consists in the organism’s adjustment to its external environment, where both the internal adjustment of the organism and external factors must be taken into account. He sees his version of critical realism as a “mediate realism” that attempts to do justice both to the contribution of the perceiving organism and the claims of objective knowledge. That is, he aims to do justice to both the real and the “ideal” sides of cognition. It is absolutely crucial, he (1922a, 76-77) stresses, to distinguish between the causal conditions of perception and the referential act of perceiving. Perception is the interpretation of sense mediated by factors both internal and external to the perceiving subject. These internal factors are not to be confused with the mechanism or processes that underlie perception (that is, the account of the internal mechanism or processes is not an account of the content of perception). By taking account of both the internal and external factors, he seeks to avoid the evils of both naïve realism and the non-realist view that the objects of perception are not independent of mind.

The attempt to do justice simultaneously to both the subjective contribution of the organism and the claims of objective knowledge is no easy matter. Various critical realists could not always agree on how best to formulate the view (See Ramsperger, 1967). For example, Sellars (1970a, 5) rejects the sort of critical realism espoused by Santayana that erects a barrier of essences between the perceiver and the external object. Perhaps his basic point is that human beings perceive independent objects with their sensations, but do not perceive sensations, essences, or other mental or ideal intermediaries themselves (Warren, 1975, 38, 42). Sellars (1970a, 114-5) stresses that the fact that the object is present to consciousness does not mean that it must be present within consciousness.

Although Sellars sometimes wrote as if his version of critical realism is definitive, few agree that it is unproblematic. Since he acknowledges the subjective contribution of the perceiver, it can resemble representationalism. Since, however, he emphasizes that perception is a direct perception of independent objects, it can resemble naïve realism. Sellars counters that critical realism is the view that human knowing is a direct knowledge of objects, but that this knowledge is mediated by “logical ideas” (See his 1970a, 113 and the “Epilogue on Berkeley” in his 1968). The problem is that it is hard to see how knowledge can be both mediated and direct. The claim that one perceives independent objects via one’s sensations but does not perceive those sensations themselves is a fair negative point, but seems to require a more robust positive account of the precise role of sensations in the perception of external objects. Sellars’ version of critical realism is intriguing, but many feel it requires further clarification (Chisholm 1955; Herbert, 1994; Wright, 1994; Levine, 2007). Perhaps this is why Sellars continued to return to the issue again and again over the decades (See his 1929a; 1929b; 1929c; 1937a; 1938b; 1939b; 1950a; 1961a; 1962; 1963; 1965; 1969d, Ch’s 4-5; 1970a, 112-131; and so forth).

3. Evolutionary Naturalism

Sellars does not have a fully developed philosophy of science, this being more characteristic of his son’s generation, but he does have definite views about scientific method and about the close relation of science to philosophy, some of which do anticipate his son’s views. Sellars’ conception of science and its relation to philosophy is intimately related to his own views of evolutionary naturalism.

In Sellars’ (1973, 160-1) view, science “builds” on common sense, but it develops new concepts based on new instruments and the application of mathematics to experience, and so forth. He rejects the monochrome Newtonian universe in favor of an evolution-generated hierarchy of different levels of emergent causality: Under certain favorable conditions, life emerges from matter and mind from life (See his 1920c; 1922a, Ch. IX; 1924a; 1927a; 1933a; 1944b; 1959a; 1932, 4; 1969d, 64-68; and 1973, 290). He is committed to the emergence of downward causal forces. That is, while the emergence of higher-order entities is causally dependent upon lower-order entities (bottom-up causation), once they emerge, the former may causally influence the latter (top-down causation) in ways not reducible to bottom-up causation (see Roy’s 1970, 38, 44-46 and Meehl and Sellars 1956). Sellars insists that the higher emergent entities are still material systems.

Although he does not deny the possibility of reductions in special cases, his conception of science is generally anti-reductionist (1922a, 16, 332; 1970, 136, 141, 240-1; Warren, 1975, 29). This explains why he holds that the scientific method cannot be identified with that of any particular science, such as physics (Warren, 1975, 29). When he (1932a, 5) describes his own view as physicalism, he does not mean physicalism in the more familiar sense but a view that accepts his own critical realism and emergence. Each of the sciences (natural, psychological, and social) treats of a particular domain in the emergent hierarchy, but none is privileged over the others.

The commitment to real independent substances in his critical realism dovetails with his evolutionary naturalism. The different levels in the emergent hierarchy are not just of events or properties, but of substances (1922a, Ch. XIII; 1932a, Ch. XII; 1943c; 1959a; 1970, 215). Though the higher emergent levels are not reducible to material mechanisms, they do not introduce new non-natural forces. Life and mind are not non-natural forces entering nature from outside, but emergent capacities of natural substances (See his 1917b, 276-283; 1922a, vii-ix, 277-278, 333-336; 1933a; 1950b). See Emmet (1932, 222-23) for Whitehead’s very different Platonistic view.

Sellars tends not to employ the classical formulation of emergence, that certain wholes are “greater than the sum of their parts”. He (1922a, 302) does, however, use such formulations occasionally. See also his remarks on the relations of wholes and parts (1917a, 31, 145, 288). Since he talks of new unitary substantial wholes, talk of separable “parts” may be seen as misleading. Wilfrid Sellars (1949) clarifies his father’s somewhat obscure views. In general, however, in language reminiscent of Bergson but understood naturalistically, Sellars (1922a, viii, 17, 139, 167, 214-215, 297, 303, 322, 335, and so forth; 1932a, 3, 401; Blitz 2010) holds that modern science is beginning to accept the notion of “creative synthesis”, the view that change sometimes involves “the genesis of what Locke called ‘real essences’”. For a discussion of the classical part-whole formulations of emergence see McDonough (2002).

Agential causality, which is central to Sellars’ ethics, is underwritten by his evolutionary naturalism (1970a, 262-267). Agential causality emerges at a certain level of evolution and organization (1970a, viii; 1973a, Ch. 15). Human beings possess no “pushbutton free will,” but rather, an emergent capacity of the human brain is able to develop new judgments and standards that make a causal difference in behaviour (1932a, 396, 405; 1957a; 1959a; 1970a, 305; 1973a, 290-1, 361-384). He called his view “critical anthropomorphism” (1917b, 278).

Sellars’ evolutionary naturalism colors his view of the relation between science and philosophy. The diversity of the various irreducible levels in the emergent hierarchy requires a diversity of distinct autonomous sciences: physics, chemistry, biology, and so forth. This yields problems with which none of the special sciences are prepared to deal. The physicist can describe the behaviour of subatomic particles, but, qua physicist, is unfamiliar with the regularities and properties at higher levels in the emergent hierarchy. Similar points, in reverse, can be made about the biologist (psychologist, sociologist, and so forth), who are familiar with the objects at their higher levels of the hierarchy, but qua biologist, psychologist, sociologist, and so forth, are unfamiliar with the laws and properties at the lower levels. Since, however, the evolutionary naturalist holds that the different levels in the emergent hierarchy constitute autonomous regions that fall outside any of the particular sciences, and since the items at different levels of the emergent hierarchy are linked in interesting ways that cannot be captured by reductions of one level to another, knowledge of the interrelations between these levels requires a different sort of knowledge, not possessed by any of the special sciences.

It is the distinctive job of the philosopher to obtain an overview of the relations between the different sciences, and between the sciences and the common sense framework, and to harmonize the new levels in the emergent hierarchy with each other and with the more stable and fixed background of inorganic nature (1922a, 263, 329; 1932a, 44ff, 79ff, 92ff). Thus, philosophy completes science. “The job of philosophy is to size up the whole situation; and it often needs new leads” (1973, 161). One can see here the general outlines of his son’s (1991, 2, 18-19, 34, and so forth) view, that the distinctive job of philosophy is to obtain a synoptic view of the way things hang together, in the broadest sense.

Sellars published Evolutionary Naturalism in 1922, a year before both Morgan’s Emergent Evolution and Alexander’s “Natural Piety” (Warren, 1970, vi), although the latter two came to be better known for the formulation of emergent evolution. Warren (1973b) remarks that Morgan told Sellars that to his knowledge, Sellars was the first to publish on emergent evolution. Bergson’s Creative Evolution, first published in 1907, does precede Sellars’ publications, but it differs in that it posits the non-scientific élan vital. Sellars saw his position as more systematic, empirical, and naturalistic than Bergson’s and Morgan’s since it does not introduce any non-natural controlling factors. Although Sellars’ evolutionary naturalism fell out of favor as reductionism gained ground, emergentism has once again arisen as a viable position in science, philosophy and religion (Beckermann, Kim, and Flores 1992; Hasker 2001; McDonough 2002; Davies and Clayton 2008; Blitz 2010; Vintiadis, and so forth).

4. Organicism

Although Sellars (1991, 415, 433) states that no other writer in recent times had challenged him as much as Whitehead, he claims that his own view deserves the title “philosophy of organism” more than Whitehead’s. This is because Sellars sees living organisms as substantial wholes, whereas Whitehead sees them as societies or nexuses of more fundamental entities. Sellars (1922a, vii-ix, 164-168) sees an organism as a product of emergent evolution in which simpler materials at a lower level are organized into new integrated substances with new causal powers at higher levels in the hierarchy. This higher-order substance is a true unity and not, as for Whitehead, a plurality (see Roy’s 1961b).

The living organism is, for Sellars, the background against which consciousness must be understood (1922a, 63, 298; 1932a, 446-7; 1949b, 95, 99). This leads him (1991, 415; 1970, 205) to agree with contemporaneous developments in physics, chemistry, biology, and psychology that emphasize fields and Gestalten, both of which are wholes that are not reducible to more fundamental entities. Even so, the focus on the important organismic background should not lead one to confuse knowledge of the object with knowledge about the organism (1922a, 186-187). For similar reasons, he does not see a person as a combination of two separable substances as in Cartesian Dualism. He (1991, 415) describes his own position, which rejects the vitalistic and non-evolutionary elements in classical Aristotelianism, as an “Aristotelianism of the Left”. The same considerations lead him (1932a, 14-15; 1961b; 1973, 354-56; 1991, 416-7) to oppose the “reformed subjectivism” which he saw as the source both of the Platonism and rejection of naturalism and humanism in Whitehead’s philosophy of organism.

5. Value Theory

Sellars’ evolutionary naturalism makes values “centripetal” to human life and supports a humanistic theory of ethics and religion (1932a, 448; 1948b; 1949b, 78; 1973, Ch. 14), all of which he counts as a virtue. He holds that human freedom emerges at a certain level of organization of organic development and lends a dignity and meaning to human life that is absent in a purely mechanical cosmos (1957a; 1949b, 103-4; 1970, 319-331). Whereas the “old materialism” had been criticized as being unable to accommodate higher values, Sellars sees it as a virtue of his “new materialism” that it “flowers into humanism” (See also his 1932a, 19; 1944b; 1950b, 427-428). The emergence of living organisms from inorganic nature is a necessary condition for the existence of a world of values (1932a, 446-7). It is people and human institutions that form the “hot center” of conscious life, while the inorganic world forms the “periphery and yet absolute condition for the whole drama” (1932a, 450).

Sellars is generally averse to ontological dualisms (1916a, 204, 245; 1922, 309; 1973a, Ch. 14; see also Sellars, McGill, and Farber, 1949) and holds they have done particular damage in value theory (see Roy’s 1917b, Ch. XVI; 1918, Ch. XII and Ch. XVI; 1921a; 1950b; Warren, 1975, 27, 41-2). In general, he holds that each side in value-dualisms captures some fragment of the truth, but in their pure forms such dualisms are incapable of yielding a coherent theory of value. Whereas some theories emphasize the objective basis, and others the subjective basis, for values, Sellars aims to do justice both “to the possibilities in the object and in the subject,” while taking “as objective a view of value as possible” (1932a, 445, 475; 1969d, Ch. 12). He sees this as an area where compromise and balance are essential. Value judgments are similar to cognitive judgments in some ways, but different in others. One can make mistakes in value judgments just as in cognitive judgments, but physical science does not discover values as properties of objects (1932a, 445; 1973, 344). Rather, values are an interpretation of objects as having the capacity to affect human life in ways important to an individual or group (1932a, 445, 459-473; Warren 1975, 40).

In cognitive judgments, human beings regard themselves as disclosing the object itself, while in value judgments human beings are estimating the object with respect to its bearing on human life (1932a, 46). When the subjectivist claims that values are based on feelings, Sellars agrees, but holds that these subjective feelings are directed towards facts that can be objectively criticized. When the objectivist claims that values are based on objective facts, Sellars agrees, but holds that these facts only have value when “estimated with respect to human living” (1932a, 444). In valuing we are constrained by objective factors just as in perceiving, but we are also “interpreting” the object in the light of factors which are taken to be intimately linked to the self (1932a, 471; 1970, 244, 253, and so forth). It is important to acknowledge that Sellars (1922a, 29-30, 194-5, 312; and see Wood, 1950, 525) does see the need for a kind of dualism in epistemology.

Sellars subjects “absolutism” and “factualism” about values to similar criticisms. He (1932a, 457-459) rejects belief in absolute or intrinsic values since “a good which is not good for someone strikes me as meaningless”. He (1932a, 16ff) describes “Eleatic views” that deny the significance of everyday beliefs as versions of “illusionism”. Similarly, when the “factualist” attempts to reduce values to some fact about human beings or human groups, for example, the fact that human beings prefer certain things and not others, Sellars (1932a, 452-3; 1970a, 245) replies that people are not like stones with only one possible reaction. That is, alluding to his critique of “naïve realism”, these various “facts” are always really only some naïve immediate value (1932a, 452). Even if some authority, for example, a church or an anthropologist, holds X is good, it is always possible to criticize that naïve immediate valuation by estimating its effect on human life. No authority, neither religious nor “scientific”, is immune to criticism.

Sellars (1932a, 446-7) stresses that “the background” to judgments of value is the emergent level of living organisms presupposed by the existence of value. Since an organism emerges from inorganic nature in the evolutionary process, his evolutionary naturalism is an essential part of his account of the genesis of the complex subject-object situations required for the existence of value (1922a, Ch. XV; 1932a, 68, 442; 1970a, 248-9, 267). Referring to his “open ended” emergent evolutionism (1970a, 267), he states that his “metaphysics of ethics in many ways represents its culmination” and that any attempt to explain the existence of value by reference to mere lifeless nature cannot succeed (1973, 359-60).

Sellars’ evolutionary naturalism is not just another version of materialism, but is enriched by his belief in the evolution of an emergent hierarchy containing the higher levels of organisms and persons (1950b, 420, 422-6; 1970a, 154-173). His naturalism “does not,” as some older versions of materialism, focused only on the physical sciences, did, “ignore the specialized areas of human living, morals, art, politics” (1932a, 449). Because man is “not just a knower but an agent” and a “desirer of good things”, the philosopher, in order to avoid an overly narrow conception of the human situation, must turn to the poets for a sense of “creative agency and decision” (1932a, 449).

6. Socialism

In The Next Step in Democracy (1916b) Sellars defends his own version of socialism (See also his 1970a, 272-73, 277-79, 289, 311, 334). Sellars distinguishes three stages of socialism: (1) the Utopian socialism of Fourier and Saint-Simon, (2) the “political socialism” that began with Marx’s Communist Manifesto, and (3) the later modification of Marx’s socialism based on an updated understanding of how society and people really work (1944-45b; 1970a, 272). The political socialism of Marx is called “scientific socialism” by its admirers, “orthodox socialism” by its critics (1970a, 279ff).

Sellars also rejects Utopian socialism as naïve and romantic, having little understanding of the obstacles to the creation of a genuine socialist society (1970a, 81). In contrast to the Utopian socialists, Sellars promotes a gradual modification of existing institutions in the light of new scientific advances with a full awareness that any “reckless unsettling” of the social foundations leads to disaster (1970a, 280, 292-293). Sellars rejected the program to overthrow tradition on the basis of naïve romantic dreams of wishful thinking.

Although Sellars (1970a, 28-287) sees Marx as a fairly realistic and concrete “sort”, he holds that Marx was misled by revolutionary ardour into seeing history as a constant war of class struggle. Sellars, by contrast, sees the Marxian stage of socialism, not so much scientific as realistic, but he thinks Marxist realism (the recognition that the old order will not easily give way to rational persuasion) led to the introduction of a dangerous militancy into socialism. Further, whereas many saw Marx’s determinism as a strength, Sellars takes Marx’s view that capitalist society contains the seeds of its own destruction as empirically falsified (1970a, 308). Further, Marx underestimated the ability of capitalism to make adjustments (1970a, 284, 286, 307-8; 1944-45b). Sellars (1970a, 287, 303-304) replaces Marx’s “semi-mechanical and almost wholly deterministic” outlook by the view that the people must learn to emancipate themselves by participation in the political process. Participation in the democratic process requires the development of the necessary virtues: cooperation and ingenuity, the application of continuous experiments to find out what works best, the determination and patience to approach the ultimate goal by slow degrees (1970a, 287). Whereas Marx seems to absolve the individual of responsibility for the eventual outcome by representing the march towards the goal as the inevitable result of the great supra-individual forces of history, Sellars (1971a, 333-334) emphasizes the essential educative role of the individuals participation in the process that renders the individual prepared for and worthy of the final goal. Although Sellars was sometimes seen as a radical in his day (1970a, 272), he defines socialism as a democratic movement whose aim is to secure the greatest justice and liberty for the maximum number of people at any given time, without the wholesale overturning of tradition by violent methods (1943d). 
In opposition to the militant socialism of old, he presents a moderate democratic recipe for achieving socialist goals via “rational reform” while escaping the “vicious dialectic of hate and counter-hate” (1970a, 291, 304). Progress cannot be achieved by one side imposing its view on the whole; rather, it is by the “interplay” of conservatives on the one side and liberals on the other that the direction and speed of social progress is determined (1916b, 3; 1970a, 307-308).

7. The Humanist Manifesto

Early in his studies, Sellars considered a career in comparative religion, but with his usual idiosyncratic twist, he wished to do so from a scientific, humanistic, and atheistic point of view. In Evolutionary Naturalism, he describes the religious impulse as “one of the most admirable … in human nature” (1922, 5; see also his 1918, 26 and his 1969a, Ch. 11), but he also holds that religion must be “brought to the world disclosed through science” (1918, 44-45, 222; see also Warren 1975, 24-25). Given his naturalism, the appeal to supernatural entities and explanations must be eliminated and replaced by an emphasis on human flourishing as citizens of a shared world (Wilson, 1995, Ch. 17). Whereas religions traditionally conceived salvation as something that comes to man from the outside, Sellars (1918, 12) sees it as something that must arise out of the “loyal union” of human beings who share a belief in the values of life. And whereas traditional religions often see creation as completed, so that a person’s job is merely to understand the pattern in order to follow it, Sellars (1947), reflecting Bergson’s influence, holds that people must learn to recognize creation as “a going concern,” in which their contribution to the further emergence of the universe is essential.

In 1932, Sellars was approached by Raymond Bragg on behalf of a Chicago-based group of humanists associated with The New Humanist (for which Bragg was an associate editor). The group had for some time been contemplating the need for an official statement of the religious humanist position, but recognizing the difficulties inherent in group authorship, chose to have a complete first draft written by a single author. After hearing him lecture in Chicago, Bragg approached Sellars about the project and Sellars accepted with the unanimous support of the Chicago group. The document published in the following year, the Humanist Manifesto of 1933 (or Humanist Manifesto I), is the result of numerous revisions by multiple contributors to Sellars’ original draft. While that draft has been lost to history, the fact that Sellars signed the 1933 document, and later claimed primary authorship of it, suggests that whatever changes were made did not, in his mind, affect the substance of what he had written. For these reasons it is reasonable to regard Sellars as the pre-eminent author of the Manifesto, although that is not to minimize the contributions of others.

In the Manifesto, Sellars attempts to put the essence of his religious humanism into a form suitable not just for fellow professors, but for the general public. It is important to remember that along with many of the original signers of the Humanist Manifesto I, Sellars conceived of humanism not as a replacement for religion but as a new religion (1918, Ch. XVI; 1969d, Ch. 11; Wilson 1995, Ch. 17). Nevertheless, his naturalized religion shades inevitably into a this-worldly humanist philosophy that, he (1932a, 7) holds, attempts to blend “those two great naturalists, Spinoza and Nietzsche, uniting the passion for life of the one with the cosmic calm of the other.”

Humanist Manifesto I was conceived as the statement of a new secular religion designed to replace the old religions that had been founded on claims of supernatural revelation, or on fear and helplessness (1918c, Foreword). It opposes an acquisitive and profit-motivated society, and outlines a mutually cooperative worldwide society committed to the rational resolution of problems. Thirty-four of the sixty-five persons asked to sign did so, including Edwin Burtt of Cornell, and John Dewey and John Herman Randall of Columbia. About one-third of the signatories were professors from the University of Chicago and from Columbia University; about half were Unitarians (Wilson 1995, Ch. 10).

The Manifesto contains fifteen theses (briefly summarized here):

1. The universe is self-existing, not created.
2. Man is a part of nature that has emerged in a continuous process.
3. Since humanists hold an organic view of life, they reject the traditional mind-body dualism.
4. Man’s religious culture is the product of a gradual natural development arising from man’s interaction with the natural environment and social heritage.
5. Science has shown that supernatural and cosmic guarantors of human values are unsupported, so religion must re-formulate its views in the light of scientific knowledge.
6. Theism, modernism, and other varieties of “new thought” have been surpassed.
7. The distinction between the secular and the religious cannot be defended any longer: Nothing that is human is alien to religion.
8. The purpose of man’s life is the complete realization of the possibilities in human personality.
9. Humanists find their religious feelings expressed in an intensified sense of their personal lives and the cooperative effort to produce social well-being.
10. There are no uniquely religious emotions connected with the supernatural.
11. Man must discourage sentimental hopes and wishful thinking and face the challenges of life by embracing rational procedures.
12. Religious humanists aim to enhance the creative element in man in order to produce a more meaningful life.
13. All social associations should exist for the promotion of human flourishing.
14. A socialized cooperative economic system must be established for the fair distribution of the necessities of life to all human beings.
15. Religious humanists seek to affirm human life rather than deny it, seek to discover the full possibilities of life, not run from them, and aim to establish the conditions of a just and meaningful life for all, not just the privileged few.

For a complete statement of the theses, see Sellars (1970a, 331-335).

Some humanists declined to sign Manifesto I. Dr. Arthur Morgan stated several differences of emphasis, but also some more substantial objections (Wilson 1995, Ch. 7). Anticipating recent views in “deep ecology” (See Sessions, 1995), Morgan felt that Manifesto I placed too much emphasis on human life and failed to recognize that there may be significance in other life-forms. Morgan called for a “race of businessmen” which sees business as a public trust, not a means to personal enrichment, and he objected to the “unjustified cocksureness” of Manifesto I, feeling that it is “not dictated by humility or imagination”. Morgan also felt that though religion should be disciplined by science, it should not be limited by it. His most biting criticism was that many humanists are “not strong in faith, hope, and love.”

John Haynes Holmes, the prominent Unitarian minister and noted pacifist, declined to sign Manifesto I since he objected to the rejection of theism in the sixth thesis, holding instead that a rational humanism “inevitably unfolds into a rational theism” (Wilson 1995, Ch. 7). He also found terms like “modernism” in the sixth thesis “hopelessly vague” and wondered why a humanist could not claim to represent the best of modernism. Although he found the deism of some of the authors “not half bad,” he insisted that “Theism … is the blossom that grows on the plant of humanism, the poetry into which it unfolds in mystic beauty”.

Harlow Shapley, a Harvard astronomer, spoke for many scientists who were reluctant to make judgments about religion: “As a social philosopher I am embryonic and I have decided that I should not misuse my position by pretending to intelligence or comprehension in a field in which my thoughts have been too scattered and probably too prejudiced” (Wilson 1995, Ch. 7). Although Shapley agreed with current traditions of protecting the weak, he was not sure that this is in keeping with “the biological traditions of the planet”. His point was not that the weak should not be protected, but that, as a scientist, he could not claim to know this, and, therefore, he should not put his authority as a scientist behind the claim.

In his retrospective on Humanist Manifesto I, Wilson remarks that he now feels it to be a mistake to tie humanism directly to socialism. Humanism should not be tied to any particular economic system, but should concern itself with the more general goals of ending disease, poverty, ignorance, prejudice, and so forth (Wilson 1995, Ch. 18).

Later manifestos and critics raised objections of their own. Humanist Manifesto II found the language in Manifesto I to be “far too optimistic” about the possibility of eliminating social evils. Francis Schaeffer (2005) authored A Christian Manifesto (in opposition to the Communist Manifesto), which holds that both the humanist and communist Manifestos, despite significant differences between them, tend to foster similar forms of social degeneration. Schaeffer sees humanism as the unfortunate view that man is the measure of all things, and holds that, even if that is not the humanist’s intention, Manifesto I undermines the ideals of objective truth and morality. One major difference between Manifesto I and later humanist Manifestos and statements is that Manifesto I arose out of religious humanism (1918, Ch. XVI), and was, accordingly, much more sympathetic to religion per se than these later documents.

The objections by various humanists, both earlier and later, to signing Humanist Manifesto I show just how difficult it is to obtain agreement on such a central issue from such a diverse group of intellectuals representing different fields and backgrounds. Nevertheless, despite the various objections to and reservations about Manifesto I, and the various replacement manifestos and declarations that appeared in later years, Manifesto I remains a significant historical document in the genesis of the humanist movement, and one that Sellars, who, it is probably fair to say, is “the principal author” of the published version, played a fundamental role in creating.

8. References and Further Reading

Several of Roy Wood Sellars’ works can be obtained in electronic form at The Internet Archive, The Autodidact Project and the online library of The Secular Web. Additional information on the various versions of the Humanist Manifestos and The Amsterdam Declaration is available online from the International Humanist and Ethical Union, the American Humanist Association, and the Council for Secular Humanism.

a. Primary Sources

Sellars, Roy Wood. 1902. “Re-interpretation of Democracy.” Inlander (University of Michigan publication), 12: 252-61.
Sellars, Roy Wood. 1907a. “The Nature of Experience.” Journal of Philosophy, Psychology and Scientific Methods: 14-18.
Sellars, Roy Wood. 1907b. “A Fourth Progression in the Relation of Body and Mind.” Psychological Review 14: 315-28.
Sellars, Roy Wood. 1907c. “Professor Dewey’s View of Agreement.” Journal of Philosophy, Psychology and Scientific Methods 4 (16): 315-28.
Sellars, Roy Wood. 1908a. “An Important Antinomy.” Psychological Review 15 (4): 237-249.
Sellars, Roy Wood. 1908b. “Consciousness and Conservation.” Journal of Philosophy, Psychology and Scientific Methods 5 (9): 235-38.
Sellars, Roy Wood. 1908c. “Critical Realism and the Time Problem I.” Journal of Philosophy, Psychology and Scientific Methods 5 (20): 542-48.
Sellars, Roy Wood. 1908d. “Critical Realism and the Time Problem II.” Journal of Philosophy, Psychology and Scientific Methods 5 (27): 597-602.
Sellars, Roy Wood. 1909a. “Causality.” Journal of Philosophy, Psychology and Scientific Methods 6: 323-28.
Sellars, Roy Wood. 1909b. “Space.” Journal of Philosophy, Psychology and Scientific Methods 6: 617-23.
Sellars, Roy Wood. 1912. “Is There a Cognitive Relation?” Journal of Philosophy, Psychology and Scientific Methods 9 (9): 225-328.
Sellars, Roy Wood. 1915. “A Thing and its Properties.” Journal of Philosophy, Psychology and Scientific Methods 12 (12): 318-28.
Sellars, Roy Wood. 1916a. Critical Realism: A Study of the Nature and Conditions of Knowledge. Chicago: Rand-McNally and Co.
Sellars, Roy Wood. 1916b. The Next Step in Democracy. New York: Macmillan.
Sellars, Roy Wood. 1917a. The Essentials of Logic. Boston: Houghton Mifflin Co.
Sellars, Roy Wood. 1917b. The Essentials of Philosophy. New York: Macmillan.
Sellars, Roy Wood. 1918a. “An Approach to the Mind-Body Problem.” Philosophical Review 27 (2): 150-63.
Sellars, Roy Wood. 1918b. “On the Nature of Our Knowledge of the External World.” Philosophical Review 27 (5): 502-12.
Sellars, Roy Wood. 1918c. The Next Step in Religion. New York: Macmillan.
Sellars, Roy Wood. 1918d. “Review of P. Coffey, Epistemology.” Journal of Philosophy, Psychology and Scientific Methods 15: 557-8.
Sellars, Roy Wood. 1919a. “The Epistemology of Evolutionary Naturalism.” Mind 28 (112): 407-26.
Sellars, Roy Wood. 1919b. “Review of George Wobbermin, Christian Belief in God.” Journal of Philosophy, Psychology and Scientific Methods 16: 277-9.
Sellars, Roy Wood. 1920a. “The Status of Categories.” The Monist 30 (2): 220-39.
Sellars, Roy Wood. 1920b. “Space and Time.” The Monist 30 (3): 321-64.
Sellars, Roy Wood. 1920c. “Evolutionary Naturalism and the Mind-Body Problem.” The Monist 30 (4): 568-98.
Sellars, Roy Wood. 1920d. “Knowledge and Its Categories.” Essays in Critical Realism, R.W. Sellars, Durant Drake, A.O. Lovejoy, James Pratt, Arthur Rogers, George Santayana, (ed’s). New York: Macmillan: 187-219.
Sellars, Roy Wood. 1920e. “Review of J. A. Leighton, The Field of Philosophy.” Journal of Philosophy, Psychology and Scientific Methods 17: 79-81.
Sellars, Roy Wood. 1920f. “Preface.” to Evolution of Values, Helen Maud Sellars, (trans.). New York: Henry Holt.
Sellars, Roy Wood. 1921a. “Epistemological Dualism versus Metaphysical Dualism.” Philosophical Review 30 (5): 482-93.
Sellars, Roy Wood. 1921b. “The Requirement of an Adequate Naturalism.” The Monist 31 (2): 249-70.
Sellars, Roy Wood. 1922a. Evolutionary Naturalism. Chicago: Open Court.
Sellars, Roy Wood. 1922b. “Is Consciousness Physical?” Journal of Philosophy, Psychology and Scientific Methods 19 (25): 690-94.
Sellars, Roy Wood. 1922c. “Concerning ‘Transcendence’ and ‘Bifurcation’.” Mind 31 (121): 31-39.
Sellars, Roy Wood. 1923a. “Le Cerveau, L’âme et La Conscience.” Bulletin de la Société Française de Philosophie: 1-14.
Sellars, Roy Wood. January, 1923b (some sources say 1922). “The Double Knowledge Approach to the Mind-Body Problem.” Proceedings of the Aristotelian Society, N.S. 23: 55-70 (reprinted in Principles of Emergent Realism: 188-201).
Sellars, Roy Wood. 1924a. “The Emergence of Naturalism.” International Journal of Ethics 34 (4): 309-38.
Sellars, Roy Wood. 1924b. “Critical Realism and Its Critics.” Philosophical Review 33 (4): 379-97.
Sellars, Roy Wood. 1926a. The Principles and Problems of Philosophy. New York: Macmillan.
Sellars, Roy Wood. 1926b. “Cognition and Valuation,” Philosophical Review 35 (2): 124-44.
Sellars, Roy Wood. 1927a. “Realism and Evolutionary Naturalism: A Reply to Professor Hoernlé.” The Monist 37 (1): 150-55.
Sellars, Roy Wood. 1927b. “Current Realism in Great Britain and the United States.” The Monist 37 (4): 503-520.
Sellars, Roy Wood. 1927c. “What is the Correct Interpretation of Critical Realism?,” Journal of Philosophy, Psychology and Scientific Methods 24 (9): 238-241.
Sellars, Roy Wood. 1927d. “Why Naturalism and Not Materialism?,” Philosophical Review 36 (3): 215-25.
Sellars, Roy Wood. 1928a. Religion Coming of Age. New York: Macmillan.
Sellars, Roy Wood. 1928b. “Current Realism in Great Britain and the United States.” Philosophy Today Edward L. Schaub, (ed.). Chicago and London (reprint from The Monist, 1927): 19-36.
Sellars, Roy Wood. 1929a. “Current Realism.” Anthology of Recent Philosophy D. S. Robinson, (ed.). New York: Thomas Y. Crowell Co. (re-print from Philosophy Today): 279-290.
Sellars, Roy Wood. 1929b. “A Re-examination of Critical Realism.” Philosophical Review 38 (5): 439-55.
Sellars, Roy Wood. 1929c. “Critical Realism and Substance.” Mind 38 (152): 473-88. Reprinted in Ruth Goff, (ed.). 2008. Revitalizing Causality: Realism About Causality in Philosophy and Social Science. New York: Routledge: 13-25.
Sellars, Roy Wood. 1930a. “A Naturalistic Interpretation of Religion.” The New Humanist 3 (4): 1-4.
Sellars, Roy Wood. 1930b. “Realism, Naturalism and Humanism.” Contemporary American Philosophy, v. 2. G. P. Adams and W. P. Montague, (eds.). New York: Macmillan: 261-85.
Sellars, Roy Wood. 1931. “Humanism, Viewed and Reviewed.” The New Humanist 4 (15): 12-16.
Sellars, Roy Wood. 1932a. The Philosophy of Physical Realism. New York: Macmillan.
Sellars, Roy Wood. 1932b. “Reinterpretation of Relativity.” Philosophical Review 41 (5): 517-18.
Sellars, Roy Wood. 1933a. “L’Hypothèse de l’Émergence.” Revue de Métaphysique et de Morale 40 (3): 309-24.
Sellars, Roy Wood. 1933b. “Religious Humanism.” The New Humanist 6 (3): 7-12.
Sellars, Roy Wood (Drafter and co-signer). May-June, 1933c. “Humanist Manifesto.” The New Humanist 6 (3): 58-61.
Sellars, Roy Wood. 1933d. “In Defense of the Manifesto.” The New Humanist 6 (6): 6-12.
Sellars, Roy Wood. 1933e. “Review of Durant Drake, Introduction to Philosophy.” Journal of Philosophy, Psychology and Scientific Methods 3: 667-9.
Sellars, Roy Wood. 1934. “Nature and Naturalism.” The New Humanist 7 (2): 1-8.
Sellars, Roy Wood. 1935a. “Review of C. F. Gauss, Primer for Tomorrow.” Michigan Alumnus Quarterly Review 41: 465-6.
Sellars, Roy Wood. 1935b. “George S. Morris.” Dictionary of American Biography 13: 208-9.
Sellars, Roy Wood. 1937a. “Critical Realism and the Independence of the Object.” Journal of Philosophy, Psychology and Scientific Methods 34 (20): 541-550.
Sellars, Roy Wood. 1937b. “Henry Philip Tappan.” Dictionary of American Biography 18: 302-3.
Sellars, Roy Wood. 1938a. “An Analytic Approach to the Mind-Body Problem.” Philosophical Review 47 (5): 461-87.
Sellars, Roy Wood. 1938b. “A Statement of Critical Realism.” Revue Internationale de Philosophie 3: 472-496.
Sellars, Roy Wood. 1939a. “Positivism in Contemporary Philosophical Thought.” American Sociological Review: 26-42.
Sellars, Roy Wood. 1939b. “A Clarification of Critical Realism.” Philosophy of Science 6 (4): 412-92.
Sellars, Roy Wood. 1940. “Knowledge and its Categories.” The Development of American Philosophy, W. G. Muelder and Laurence Sears, (ed’s). Cambridge, Mass: 431-40 (Reprinted from Drake, Durant. 1920. Essays in Critical Realism. New York: Gordian Press: 187-219)
Sellars, Roy Wood. 1941a. “Humanism as a Religion.” The Humanist 1 (1): 5-8.
Sellars, Roy Wood. 1941b. “A Correspondence Theory of Truth.” Journal of Philosophy, Psychology and Scientific Methods 38 (24): 653-54.
Sellars, Roy Wood. 1942a. “Aspects of Democracy II: the Quality of Democracy.” Michigan Alumnus Quarterly Review 48: 98-103.
Sellars, Roy Wood. 1942b. “Galileo Galilei.” Michigan Alumnus Quarterly Review 48: 301-7.
Sellars, Roy Wood. 1942c. “Review of E. Gilson, God and Philosophy.” The Humanist 2: 36-7.
Sellars, Roy Wood. 1942-43. “Dewey on Materialism.” Journal of Philosophy and Phenomenological Research 3 (4): 381-92.
Sellars, Roy Wood. 1943a. “Science, Philosophy, and Religion.” The Humanist 3: 84-5.
Sellars, Roy Wood. 1943b. “Verification of Categories: Existence and Substance.” Journal of Philosophy 40 (8): 197-205.
Sellars, Roy Wood. 1943c. “Causality and Substance.” Philosophical Review 52 (1): 1-27 (Reprinted in Ruth Goff, (ed.). 2008. Revitalizing Causality: Realism About Causality in Philosophy and Social Science. New York: Routledge: 26-45).
Sellars, Roy Wood. 1943d. “Reason and Revolution,” Michigan Alumnus Quarterly Review 49: 212-14.
Sellars, Roy Wood. 1943e. “Review of J. Maritain, Education at the Cross Roads.” The Humanist 3: 165-70.
Sellars, Roy Wood. 1944a. “Causation and Perception.” Philosophical Review 53 (6): 534-56.
Sellars, Roy Wood. 1944b. “Reformed Materialism and Intrinsic Endurance.” Philosophical Review 53: 359-82.
Sellars, Roy Wood. 1944c. “Is Naturalism Enough?” Journal of Philosophy, Psychology and Scientific Methods 41 (September): 533-44.
Sellars, Roy Wood. 1944d. “Does Naturalism Need Ontology?” Journal of Philosophy, Psychology and Scientific Methods 41 (25): 686-94.
Sellars, Roy Wood. 1944e. “Can a Reformed Materialism Do Justice to Values?” Ethics 55 (1): 28-45.
Sellars, Roy Wood. 1944-45a. “The Meaning of True and False.” Journal of Philosophy and Phenomenological Research 5 (1): 98-103.
Sellars, Roy Wood. 1944-45b. “Reflections on Dialectical Materialism.” Journal of Philosophy and Phenomenological Research 5 (2): 157-79.
Sellars, Roy Wood. 1944-45c. “Knowing and Knowledge.” Journal of Philosophy and Phenomenological Research 5 (3): 341-344.
Sellars, Roy Wood. 1944-45d. “Knowing through Propositions.” Journal of Philosophy and Phenomenological Research 5 (3): 348-9.
Sellars, Roy Wood. 1945-46. “Review of Yervant Krikorian, Naturalism and the Human Spirit.” Journal of Philosophy and Phenomenological Research 6: 436-9.
Sellars, Roy Wood. 1946a. “A Note on the Theory of Relativity.” Journal of Philosophy, Psychology and Scientific Methods 43 (12): 309-17.
Sellars, Roy Wood. 1946b. “Materialism and Relativity: A Semantic Analysis.” Philosophical Review 55 (1): 25-51.
Sellars, Roy Wood. 1946c. “Philosophy and Physics of Relativity.” Philosophy of Science 13 (3): 177-95.
Sellars, Roy Wood. 1946-47. “Positivism and Materialism.” Journal of Philosophy and Phenomenological Research 7 (1): 12-40.
Sellars, Roy Wood. 1947. “Accept the Universe as a Going Concern.” Religious Liberals Reply Henry Wieman, (ed.). Boston: Beacon Press.
Sellars, Roy Wood. 1948a. “Do the Natural Sciences Have a Need of the Social Sciences?,” Philosophy of Science 15 (2): 104-8.
Sellars, Roy Wood. 1948b. “Naturalistic Humanism.” Religion in the Twentieth Century Vergilius Ferm, (ed.). New York: Littlefield and Adams (later edition date 1958): 415-31.
Sellars, Roy Wood. 1948c. “Review of A. N. Whitehead, Essays in Science and Philosophy.” The Humanist 8: 92-3.
Sellars, Roy Wood. 1949a. “Social Philosophy and the American Scene.” Philosophy for the Future R. W. Sellars, V. J. McGill, and M. Farber, (ed.’s). New York: Macmillan: 61-75.
Sellars, Roy Wood. 1949b. “Materialism and Human Knowing.” Philosophy for the Future R. W. Sellars, V. J. McGill, and M. Farber, (ed’s). New York: Macmillan: 75-106.
Sellars, Roy Wood. 1949c. “Resume of W. Cook Foundation Lectures.” (delivered by Ralph Barton Perry), Michigan Alumnus Quarterly Review 55: 185-94.
Sellars, Roy Wood, McGill, V.J., Farber, Marvin. 1949. Foreword to Philosophy for the Future, R. W. Sellars, V. J. McGill, and M. Farber, (ed’s). New York: Macmillan: v-xii.
Sellars, Roy Wood. 1949-50. “Review of Leslie A. White, The Science of Culture.” Journal of Philosophy and Phenomenological Research 10: 586-7.
Sellars, Roy Wood. 1950a. “Critical Realism and Modern Materialism.” Philosophical Thought in France and the United States, Marvin Farber, (ed.). Buffalo: The University of Buffalo Publications: 463-82.
Sellars, Roy Wood. 1950b. “The New Materialism.” A History of Philosophical Systems V. Ferm, (ed.). New York: Philosophical Library: 418-28.
Sellars, Roy Wood. 1950c. “Review of Frank Chapman Sharp, Good Will and Ill Will.” The Humanist 10: 277-8.
Sellars, Roy Wood. 1950d. “Review of Leslie A. White, The Science of Culture.” Michigan Alumnus Quarterly Review 56: 175-6.
Sellars, Roy Wood. 1950-51. “The Spiritualism of Lavelle and Le Senne.” Journal of Philosophy and Phenomenological Research 11 (3): 386-93.
Sellars, Roy Wood. 1951. “Professor Goudge’s Queries with Respect to Materialism.” Philosophical Review 60 (2): 243-8.
Sellars, Roy Wood. 1951-52a. “Review of R. W. Boynton, Beyond Mythology.” Journal of Philosophy and Phenomenological Research 12: 146-8.
Sellars, Roy Wood. 1951-52b. “Review of Charles Mayer, Man: Mind or Matter.” Journal of Philosophy and Phenomenological Research 12: 436-42.
Sellars, Roy Wood. 1952. “Le spiritualisme de Louis Lavelle et de René Le Senne.” Les Études Philosophiques 9 (1/2): 30-40.
Sellars, Roy Wood. 1955. “My Philosophical Position: A Rejoinder.” Journal of Philosophy and Phenomenological Research 16 (1): 72-97.
Sellars, Roy Wood. 1956a. “Physical Realism and Relativity: Some Unfinished Business.” Philosophy of Science 23 (2): 75-81.
Sellars, Roy Wood. 1956b. “Gestalt and Relativity: An Analogy.” Philosophy of Science 23 (4): 275-279.
Sellars, Roy Wood. 1957a. “Guided Causality, Using Reason and ‘Free Will’.” Journal of Philosophy 54 (August): 485-93.
Sellars, Roy Wood. 1957b. “Philosophical Orientation and Peace.” The Idea of War and Peace in Contemporary Philosophy Irving Louis Horowitz, (ed.). New York: vii-xx (The book was re-released by Literary Licensing Publisher in 2012).
Sellars, Roy Wood. 1959a. “Levels of Causality: The Emergence of Guidance and Reason in Nature.” Journal of Philosophy and Phenomenological Research 20 (1): 1-17.
Sellars, Roy Wood. January, 1959b. “Sensations as Guides to Perceiving.” Mind 68 (January): 2-15.
Sellars, Roy Wood. 1959c. “‘True’ as Contextually Implying Correspondence.” Journal of Philosophy 56 (18): 712-22.
Sellars, Roy Wood; Lamont, Corliss; Otto, Max; Huxley, Julian; Williams, Gardner; Randall Jr., John Herman. 1959. “A Humanist Symposium on Metaphysics.” Journal of Philosophy 56 (2): 45-64.
Sellars, Roy Wood. October 1960. “Panpsychism or Evolutionary Materialism.” Philosophy of Science 27 (October): 229-50.
Sellars, Roy Wood. 1961a. “Referential Transcendence.” Journal of Philosophy and Phenomenological Research 22 (1): 1-15.
Sellars, Roy Wood. 1961b. “Querying Whitehead’s Framework.” Revue Internationale de Philosophie 56-57: 135-66.
Sellars, Roy Wood. 1962. “American Critical Realism and British Theories of Sense Perception I and II.” Methodos: 61-108.
Sellars, Roy Wood. 1963. “Direct, Referential Realism.” Dialogue 2 (02): 135-43.
Sellars, Roy Wood. 1965. “Existentialism, Realistic Empiricism, and Materialism.” Journal of Philosophy and Phenomenological Research 25 (3): 315-32.
Sellars, Roy Wood. 1968. Lending a Hand to Hylas. Ann Arbor: Edward Brothers.
Sellars, Roy Wood. 1969a. “A Possible Integration of Science and Philosophy.” Zygon 4 (3): 293-97.
Sellars, Roy Wood. 1969b. “Some Questions and Suggestions: An Exposition.” Journal of Philosophy 66 (24): 859-60.
Sellars, Roy Wood. 1969c. “Le naturalisme de Sellars.” Dialectica 23 (1): 79-80.
Sellars, Roy Wood. 1969d. Reflections on American Philosophy from Within. Notre Dame: University of Notre Dame Press.
Sellars, Roy Wood. 1970a. Principles of Emergent Realism. W. Preston Warren, (ed.). St. Louis: Warren H. Green.
This book is the best place to obtain an overview of R.W. Sellars’ writings; it combines extensive primary sources with commentary tracing the course of his development.
Sellars, Roy Wood. 1970b. Social Patterns and Political Horizons. Nashville: Aurora Publishers.
Sellars, Roy Wood. 1970c. Principles, Perspectives, and Problems of Philosophy. New York: Pageant Press International Corp.
Sellars, Roy Wood. 1973a. Neglected Alternatives: Critical Essays by Roy Wood Sellars. William Preston Warren, (ed.). Lewisburg: Bucknell University Press.
Sellars, Roy Wood. January-February, 1973b. “Toward a New Humanist Manifesto.” The Humanist.
Sellars, Roy Wood. 1973c. “Recollections of Marvin Farber.” In Phenomenology and Natural Existence, Dale Riepe, (ed.). Albany: State University of New York Press.
Sellars, Roy Wood. 1975. Foreword to William Preston Warren. Roy Wood Sellars. Boston: Twayne.
Sellars, Roy Wood. 1991. “Philosophy of Organism and Physical Realism.” The Philosophy of Alfred North Whitehead. Paul A. Schilpp, (ed.). LaSalle: Open Court: 407-433 (Original publication date, 1941).

b. Secondary Sources

Avery, Jon Henry. 1989. “An Analysis and Critique of Roy Wood Sellars’ Descriptive and Normative Theories of Religious Humanism.” PhD diss., The Iliff School of Theology and University of Denver.
Bahm, Archie, 1954. “Evolutionary Naturalism.” Philosophy and Phenomenological Research 15 (1): 1-12.
Baker, Richard R. 1950. “The Naturalism of Roy Wood Sellars,” New Scholasticism. 24 (1): 3-31.
Beckermann, Ansgar, Flohr, Hans, Kim, Jaegwon. 1993. Emergence or Reduction: Essays on the Prospects of Non-Reductive Physicalism. Berlin: De Gruyter.
Benjamin, Cornelius. 1934. Book Review: “The Philosophy of Physical Realism.” Roy Wood Sellars. Ethics. 44 (2): 270.
Bergson, Henri. 1998. Creative Evolution. Arthur Mitchell, (trans.). New York: Dover.
Blau, Joseph, 1952. Men and Movements in American Philosophy. New York: Prentice-Hall.
Blitz, David. 2010. Emergent Evolution and Creative Novelty. New York: Springer.
Bogomolov, A.S. 1962. “Roy Wood Sellars in the Materialist Theory of Knowledge,” Russian Studies in Philosophy. 1 (3): 31-32.
Chisholm, Roderick, 1955. “Critical Realism,” Philosophy and Phenomenological Research. 15 (1): 33-47.
Davies, Paul, Clayton, Philip, (ed’s). 2008. The Re-Emergence of Emergence: The Emergentist Hypothesis from Science to Religion. Oxford: Oxford University Press.
Delaney, C. F., 1969. Mind and Nature: A Study of the Naturalistic Philosophy of Cohen, Woodbridge and Sellars. Notre Dame: University of Notre Dame Press.
Delaney, C. F. 1971. “Sellars and the Contemporary Mind-Body Problem,” The New Scholasticism 45: 245-68.
Emmet, Dorothy. 1932. Whitehead’s Philosophy of Organism. London: Macmillan.
Ferm, Vergilius. 1950. “Varieties of Naturalism,” A History of Philosophical Systems. V. Ferm, (ed.). New York: Philosophical Library: 429-441.
Frankena, William. 1954. “Theory of Valuation,” Philosophy and Phenomenological Research. 15 (1): 65-81.
Frankena, William. Dec. 1973. “Roy Wood Sellars: Obituary,” Philosophy and Phenomenological Research 34 (2): 300-301.
Frankena, William. 1973-74. “Roy Wood Sellars: Memoriam,” Proceedings and Addresses of the American Philosophical Association 47: 230-232.
Gluck, Samuel E. 1971. Review of Norman Paul Melchert’s Realism, Materialism, and the Mind: The Philosophy of Roy Wood Sellars. Springfield, Illinois: Charles C. Thomas. Philosophy 46 (177): 281ff.
Grayling, A.C. 2003. Meditations for the Humanist: Ethics for a Secular Age. Oxford: Oxford University Press.
Griffin, James Phillip, 1966. “Foundations of Ethical Value in the Philosophy of Roy Wood Sellars and William Temple.” PhD diss., Boston University.
Hasker, William. 2001. The Emergent Self. Ithaca: Cornell University Press.
Herbert, David. 1994. “A New Critical Realism: An Examination of Roy Wood Sellars’ Epistemology,” Transactions of the Charles Sanders Peirce Society 30 (3): 477 – 514.
Hoor, Marten. 1954. “Humanism as a Religion,” Philosophy and Phenomenological Research 15 (1): 84-97.
Hudson, Yeager. 1965. “Metaphysical Causality in the Philosophies of Brand Blanshard, Roy Wood Sellars, and John Laird.” PhD diss., Boston University.
Iobst, Philip Kirschman. 1975. “The Normative Philosophy of Roy Wood Sellars: A Critical Examination.” PhD diss., State University of New York at Buffalo.
Kreyche, Robert. 1951. “The Naturalism of Roy Wood Sellars.” PhD diss., University of Ottawa.
Kuiper, John. 1954 (some references say 1955). “The Mind-Body Problem,” Philosophy and Phenomenological Research 15 (September): 46-84.
Kurtz, Paul. 1973. Humanist Manifestos I and II. Amherst, NY: Prometheus Books.
Kurtz, Paul. 1981. “The Arrogance of Humanism,” International Studies in Philosophy 13 (1):91-93.
Kurtz, Paul. 1983. A Secular Humanist Declaration. Amherst, NY: Prometheus Books.
Kurtz, Paul. 2000. Humanist Manifesto 2000: A Call for a New Planetary Humanism. Amherst, NY: Prometheus Books.
Kurtz, Paul. 2007. What is Secular Humanism? Amherst, NY: Prometheus Books.
Lamont, Corliss. 1997. The Philosophy of Humanism. Washington, D.C: Humanist Press.
Levine, Steven. 2007. “Sellars’ Critical Direct Realism,” International Journal of Philosophical Studies 15 (1): 53-76.
McDonough, Richard. 2002. “Emergence and Creativity: Five Degrees of Freedom,” Creativity, Cognition, and Knowledge Terry Dartnall, (ed.). Westport, Connecticut: Praeger: 283-321.
Melchert, Norman Paul. 1964. “An Examination of the Physical Realism of Roy Wood Sellars.” PhD diss., University of Pennsylvania.
Melchert, Norman Paul. 1968. Realism, Materialism, and the Mind: The Philosophy of Roy Wood Sellars. Springfield, Ill.: Charles C. Thomas.
Munk, Arthur W. P. 1945. “Roy Wood Sellars’ Criticism of Idealism.” PhD diss., Boston University.
Ramsperger, A.G. 1967. “Critical Realism,” Encyclopedia of Philosophy, v. 2. Paul Edwards, (ed.). New York: Macmillan and the Free Press: 262-263.
Reck, Andrew. 1962. Recent American Philosophy: Studies of Ten Representative Thinkers. New York: Pantheon Books.
Reck, Andrew. 1971. “The Realism of Roy Wood Sellars,” The New Scholasticism. 45 (2): 209-44.
Sellars, Wilfrid. 1949. “Aristotelian Philosophies of Mind,” Philosophy for the Future, Roy Wood Sellars, V.J. McGill, Marvin Farber, (eds.). New York: Macmillan: 544-570.
Sellars, Wilfrid. 1955. “Physical Realism,” Philosophy and Phenomenological Research 15 (1): 13-32.
Sellars, Wilfrid, and Meehl, Paul. 1956. “The Concept of Emergence,” Minnesota Studies in the Philosophy of Science, v. 1. Minneapolis: University of Minnesota Press: 239-252.
Sellars, Wilfrid. 1965. “The Identity Approach to the Mind-Body Problem,” Review of Metaphysics 18 (March): 430-51.
Sellars, Wilfrid. 1971. “The Double-Knowledge Approach to the Mind-Body Problem,” The New Scholasticism 45 (2): 269-89. (Note that Roy had published an article with the same title in 1923.)
Sellars, Wilfrid. 1991. “Philosophy and the Scientific Image of Man,” Science, Perception and Reality Atascadero, California: Ridgeview: 1-40.
Schaeffer, Francis. 2005. A Christian Manifesto. Wheaton, Illinois: Crossway Books.
Sessions, George. 1995. Deep Ecology for the Twenty-First Century. Boston: Shambhala.
Slurink, Pouwel. 1996. “Back to Roy Wood Sellars: Why His Evolutionary Naturalism is Still Worthwhile,” Journal of the History of Philosophy 34 (3): 425-44.
Rowntree, Clifford. 1964. “Direct, Referential Realism: A Comment,” Dialogue 2 (04): 452-453.
Shoemaker, Sydney. 2002. “Emergence,” Philosophical Studies 58 (1-2): 53-63.
Trelo, Virgil J. 1966. The Critical Realism of Roy Wood Sellars. Lisle, Ill.: St. Procopius College.
Vintiadis, Elly. “Emergence,” Internet Encyclopedia of Philosophy
Warren, William Preston. 1967. “Realism 1900-1930: An Emerging Epistemology,” The Monist. 51 (2): 179-205.
Warren, William Preston. 1970. Introduction to Roy Wood Sellars. Principles of Emergent Realism, W. Preston Warren, (ed.). St. Louis: Warren H. Green: xi-xxiv.
Warren, William Preston. 1972a. “Foundations of Philosophy,” Bucknell Review 19 (3): 69-100.
Warren, William Preston. 1972b. “Experimentalism Plus,” Philosophy and Phenomenological Research 33 (2): 149-82.
Warren, William Preston. 1973a. “A Brief Biography of Roy Wood Sellars,” Neglected Alternatives: Critical Essays by Roy Wood Sellars. W. Preston Warren, (ed.). Lewisburg: Bucknell University Press: 19-22.
Warren, William Preston. 1973b. Preface to Roy Wood Sellars’ Neglected Alternatives. Lewisburg: Bucknell University Press: 7-15.
Warren, William Preston. 1975. Roy Wood Sellars. Boston: Twayne.
This book is probably the best sympathetic secondary source on R.W. Sellars’ views.
Werkmeister, W. H. 1981. History of Philosophical Ideas in the United States. New York: Ronald Press.
Warren, William Preston. 2007. “Roy Wood Sellars: Philosopher of Religious Humanism,” Notable American Unitarians, Herbert Vetter, (ed.). Cambridge, Harvard Square Library: 211-213.
Wilson, Edwin. 1995. The Genesis of a Humanist Manifesto. Amherst, NY: Humanist Press.
Wood, Ledger. 1950. “Recent Epistemological Schools,” A History of Philosophical Systems, V. Ferm, (ed.). New York: Philosophical Library: 516-539.
Wright, Edmund. 1994. “A New Critical Realism: An Examination of the Critical Realism of Roy Wood Sellars,” Transactions of the Charles Sanders Peirce Society 30 (3): 477-514.

Author Information

Richard McDonough
Email: rmm249@cornell.edu
Arium School of Arts & Sciences
Singapore

African Sage Philosophy

The Sage Philosophy Project began in the mid-1970s at the Department of Philosophy of the University of Nairobi, Kenya. At the University, Henry Odera Oruka (1944-1995) popularized the term “Sage Philosophy Project,” and closely related terms such as “philosophic sagacity,” both by initiating a project of interviewing African sages, and by naming this project in a widely read popular article as the most promising of four trends of the relatively new field of African philosophy.

This encyclopedia article focuses primarily on Oruka and his immediate sources of inspiration, and then includes others whose projects share similar methodologies and goals.

Although the definition of the key terms is not always completely uniform, at the heart of this approach to African philosophy lies the emphasis on academically-trained philosophy students and professors interviewing non-academic wise persons whom Oruka called “sages,” and then engaging philosophically with the interview material. Oruka usually (but not always) emphasized keeping the identity of the individual sage well known. He also insisted that it was the sage who knew the traditions of his or her ethnic group the best, and who would be able to have critical distance to evaluate and sometimes reject prevailing beliefs and practices. The goals of collecting the interviews and evaluating them have been articulated in Oruka’s many works. The first goal was to help construct texts of indigenous African philosophies. Before Oruka’s project there was a dearth of existing texts and a need to record indigenous ideas, both for posterity (that is, for a sense of identity and for historical reasons) and for the present and future. African wisdom that had been marginalized by academia, and by city life, could provide valuable solutions to contemporaneous problems in Africa. Such texts of interviews could also sustain intellectual curiosity and provide practical guidance (or phronesis).

Oruka searched for sages and wanted a wider public to know not only their words (written down in transcripts) but also about their lives. For him, a sage’s worth was not only in their ideas but also in the way they live: by embodying their philosophies, developing their character, and affecting their communities over the years. After all, the sages in Kenya operate in contexts of social conflict and exploitation. Sages are those from whom others seek moral and metaphysical advice and consultation on issues involving moral and psychological attitudes and judgments. Oruka looked to the term japaro in Luo, meaning “thinker,” to approximate the translation of sage. The term japaro is closely related to jang’ ad rieko which means “professional advisor.” He emphasized that people would single out sages for advice on even the most delicate matters.

Oruka Biography and Early Writings
Sage Philosophy in Philosophical Context
Beginning Interviews in Kenya
Relationship to the Hallen-Sodipo Study
Folk Sages and Philosophic Sages
Criticisms of Sage Philosophy
Culture Philosophy and Its Relationship to Philosophic Sages
Oruka’s Sage Philosophy: the Last Few Years
Sage Philosophy Research by Other Philosophers: Students
Sage Philosophy Research by Other Philosophers: Other Scholars
References and Further Reading

1. Oruka Biography and Early Writings

The history of the project begins in the 1970s; nevertheless, it is important to understand the project’s beginning in the context of its immediate precursors, both those that served as partial models and those that served as negative examples of what must not be done. It is also important to know something about Oruka’s academic training and background, and the skills and interests he brought to the project.

Oruka grew up surrounded by sages in his home area of Ugenya, in the Nyanza Province of Kenya, and as a youth he looked up to them and learned much wisdom from them. Graduating from St. Mary’s High School in Yala, he won a scholarship to study geography at Uppsala University in Sweden. While there, Oruka was influenced by philosophy Professor Ingemar Hedenius to follow his newly developing interests and study philosophy instead. Philosophy studies at Uppsala were divided into two tracks, Practical and Theoretical, and Oruka specialized in Practical Philosophy: Applied Ethics and Political Philosophy. The approach to philosophy Oruka learned both in Sweden and later at Wayne State University in Detroit, Michigan, was greatly influenced by the logical empiricists. Indeed, Oruka referred to himself as an empiricist as well (Practical 283). He would later remark that this narrow emphasis on analytic philosophy that he received in his formal training was an initial “handicap” to his ability to enter the debates on African philosophy upon his return to Kenya (Oruka, Trends 127).

When he returned to Kenya in 1970, Oruka became one of the first two African philosophy faculty members at University of Nairobi. At that time, many departments at the University of Nairobi (UON) were questioning the Eurocentric curriculum that was their colonial heritage. Ngugi wa Thiong’o, Okot p’Bitek, and Taban Lo Liyong were some of the scholars challenging the curriculum in literature, development studies, and other areas (Ogot). The Institute for African Studies at UON was founded in 1970. Sage philosophy was an attempt to rise to the challenge of imagining an approach to philosophy that focused on African ideas and realities. The fields of literature and history had turned to oral sources; there was no reason that philosophy could not do the same.

When Oruka received his first full-time position in 1970, the field of African Philosophy was dominated by Placide Tempels, John Mbiti, and other early scholars who sometimes blurred the line between religious and philosophical thinking. Also, at that time, the Philosophy and Religious Studies departments at UON were merged. Having studied with Hedenius, famous for his arguments in favor of atheism, Oruka distinguished himself with early essays in 1972 and 1975 denouncing much of what was passing for “African philosophy” as no more than dressed-up mythical thinking. He later judged these articles “youthful” as well as “simplistic and unnecessarily offensive” (Oruka, Trends 12, n.1; 125-29; Practical 285; Graness and Kresse 12). He championed a secular and logical approach to life’s big questions. However, also impressed by the need to appreciate an unfairly-marginalized, substantial body of thought coming from Africa, Oruka proposed his “sage philosophy” project as a way to provide missing information about African ideas and values. He was convinced that rural sages were not merely “religious figures” but thinkers who used their own rational powers to develop insights, and who could explain their reasoning to others.

In his early 1972 article “Mythologies as African Philosophy” Oruka was to insist on jettisoning traditions harmful to Africa’s present and future. He criticized both Placide Tempels’ book Bantu Philosophy and John Mbiti’s book African Religions and Philosophy as backward-looking champions of absolutely unphilosophical African traditions. He agreed with Fanon’s criticism of a certain type of misguided African intellectual who falsely builds up the greatness of African tradition in a futile attempt to convince Europeans that African culture is as good as theirs. Oruka wanted instead to write for an African audience (Oruka in Graness and Kresse Sagacious 1999 ed., 23).

In “Mythologies,” Oruka began to articulate his emphasis on the need to acknowledge individual thinkers. By anonymizing everyone and providing only group consensus, Tempels, Mbiti, and W. E. Abraham (author of The Mind of Africa) presented “philosophy without philosophers.” He suggested, “We can as well start afresh by interviewing sage Africans and eliciting philosophical expositions from them” (Oruka in Graness and Kresse Sagacious 1999 ed., 30). While individuals’ thinking is influenced by their community and material conditions, they are not determined by them, and in fact individuals can also influence groups. Oruka also pointed out that a philosopher’s role is not just to describe how people think and act, but to make suggestions as to how they ought to think and act (Oruka in Graness and Kresse Sagacious 1999 ed., 31).

2. Sage Philosophy in Philosophical Context

Oruka conceived of the project in relation to interjections from Kwasi Wiredu and Paulin Hountondji, whom he had met and who had both been invited to University of Nairobi. He had become familiar with their written works in early philosophy journals published in Africa, such as Second Order (University of Ife Press, Nigeria), Universitas (Accra), and Cahiers philosophiques africains/African philosophical journal (Zaire) (Oruka, Trends 129-30, 132-33). Both scholars had studied philosophy in African universities and abroad, Wiredu at University College, Oxford, and Hountondji at the École Normale Supérieure, and both were critical of the ethnophilosophical approaches of Tempels and Mbiti.

Wiredu, based in Ghana, emphasized the secular and rational nature of much ethical thought among the Akan groups in Ghana. He outlined three major hindrances to African cultural regeneration: anachronism, authoritarianism, and supernaturalism. But he also insisted that Africa had very wise and philosophical persons from whom a lot could be learned, especially if one paid attention to the nuances of concepts in African languages. In a 1972 issue of Second Order, Wiredu wrote that “it is a particular (though not exclusive) responsibility of African philosophers to research into their traditional background of philosophical thought” (“On an African Orientation” 12). However, he argued, while traditional concepts and codes of conduct should be an area of study, they should not lead to anachronism—an attempt to turn back the hands of time or cling to the days of yesteryear (7).

Wiredu was the first to label “what ‘our elders’ said” as “folk philosophies.” He found exciting the prospect of constructing, from “the living wise men of the tribe,” “the elaborate and argumentative reasons” behind the belief systems and moral guidelines of “our philosophers of old.” Still, the resulting material could not, Wiredu believed, help to tackle most modern problems in Africa (“On an African Orientation” 5). Along with interest in past traditions, he maintained, scientific method and clear argumentation were necessary to guide African youths in confronting the new moral dilemmas facing contemporary African society. Barry Hallen, scrutinizing Wiredu’s article, says that Wiredu intended the phrase “folk philosophies” to refer to unreasoned beliefs whether they were African or Western (Hallen “Yoruba” 106-08). Wiredu followed up this exploration with an article that Oruka recommended to his readers, in which Wiredu compared and contrasted the meaning of “philosopher” and “wise man.” The material, first published in the article (Wiredu “What Is”), was later incorporated in Wiredu’s book (Wiredu Philosophy 139-173; see Oruka Trends 69n5).

Three years later (1975), in Second Order, Oruka explained that he and others at UON were already engaged in a project along the lines of Wiredu’s description. He said, “We are seeking to unsheathe, through constant contacts and discussions with those concerned, the elaborate philosophical views and reasons from the living traditional Kenyan thinkers and sages” (Oruka “The Fundamental” 54n6). He followed Wiredu’s words and ideas closely enough to repeat the descriptors “elaborate” and “reasons.” In his subsequent book he adopted the descriptors “folk philosophies” and “folk sage,” but clarified that, in addition to elders who are examples of folk sagacity, there were some philosophic sages able to scrutinize prevailing beliefs and give sustained arguments for their positions. The elders, he asserted, were more than just depositories of outdated folk wisdom. Philosophical sages were able both to describe the “culture philosophy” held by most members of their community and also to evaluate the content (or at least understand the genesis) of such culture philosophies. In Philosophy and an African Culture (1980) Wiredu affirmed that “The recording and critical study of the thought of individual indigenous thinkers is worthy of the most serious attention of contemporary African philosophers” (37). In Cultural Universals and Particulars (1996), Wiredu wrote that Oruka’s sage philosophy book was the first to give “substantial notice” to individual philosophical thinkers in Africa (116).

Paulin Hountondji was another key influence on the development of sage philosophy. Hountondji gave a talk, “Philosophy and Its Revolutions,” at the National University of Zaire during “Special Philosophy Days” in June 1973, and a second time at University of Nairobi in November, 1973. Invitations for these talks came from the Philosophical Association of Kenya, which Oruka had founded (African 71). A paper based on the talks was published in French in 1973 in Cahiers philosophiques africains/African Philosophical Journal and later incorporated into Hountondji’s book, African Philosophy: Myth and Reality (71-108). Hountondji’s “Revolution” article, and chapter, which Oruka and other Kenyans heard in person in 1973, criticized Tempels’ book Bantu Philosophy but appreciated the works of two European anthropologists, Paul Radin and Marcel Griaule, suggesting that their approach was much more careful than Tempels’. In fact, Hountondji said, Tempels’ study was “behind the anthropology of the time” (African 76). Twenty years earlier than Tempels, Radin wrote Primitive Man as Philosopher, a study of philosophy in Africa that focused on original thinkers who were members of an intellectual class in their communities. Hountondji explained that Radin denounced the prejudice that African individuals are submerged in unitary group-think and took it upon himself to transcribe faithfully what members of this intellectual class told him (African 76; “La Philosophie” 30-31).

Paul Radin was an anthropologist originally from Poland who had studied with Franz Boas at Columbia University. Radin recorded interviews with members of a Native American community from Nebraska called the Winnebago. He explained in his book the necessity of researchers presenting “statements made by the Winnebago” word-for-word to the public, rather than merely recounting others’ ideas in ways that mixed the researcher’s interpretation with the words and views of those interviewed (64). Researchers who thought they did readers a service, by weaving together narratives and accounts of multiple informants in a harmonizing way, actually hid the extent of disagreement and diversity of opinion in the community (xxxviii). Since primary sources are so valuable, Radin advocated a method of careful direct questioning, a process which under the best circumstances “can become something analogous to a true philosophical dialogue” (xxxi). Radin first published his book in 1927 but came out with a second edition in 1957 which critiqued Placide Tempels’ approach as presumptive and wrong-headed insofar as Tempels presumed to describe Bantu philosophy on behalf of Bantu speaking people, instead of letting them speak for themselves.

Hountondji stated that “Radin’s work is still, to the best of my knowledge, the most lucid ethnological critique of the theoretical assumptions of ethnophilosophy” (African 79). He praised Radin for showing the level of variations in retellings of particular myths and the ways each narrator influenced the myth in their own way, thus demonstrating the “profound individualism” among African intellectuals. Though he faulted Radin for use of the insulting word “primitive,” Hountondji was struck by how, unlike other Western anthropologists, Radin conveyed Africa as a place with views as plural as those of Western societies (African 79). While Radin’s study predated Oruka’s coining of the term “sage philosophy,” certainly Radin’s project shared much in common (both in goals and method) with Oruka’s later project. While Radin’s own first-hand research was with the Winnebago tribe (now more accurately called the Ho-Chunk people) in North America, Radin’s book drew upon primary source narratives of philosophical thought from various communities around the world, including proverbs and poems from Africa.

John Dewey, who wrote the foreword to Radin’s book, thanked Radin for challenging certain common misconceptions of Africa, which tended to present Africans as accepting “automatic moral standards” based on custom, when in fact African communities respected freedom of expression and emphasized individual moral responsibility (Radin xix). The relationship and consistency between Radin’s approach and that of Oruka’s sage philosophy project was alluded to by Kai Kresse (27-28), Lucius Outlaw (in Oruka Sage 244n27), and Godwin Azenabor (73).

While Oruka probably heard about Radin in Hountondji’s 1973 presentation in Nairobi, Oruka nowhere credited Radin as an inspiration for his own chosen methods. In fact, Oruka engaged in a lifelong castigation of anthropologists, condemning them along with missionaries like Tempels. Oruka presumed that all anthropologists anonymized and conglomerated their sources into one, and he asserted that no anthropologist had devised a method similar to his own. Another important distinction to highlight is that Radin made extensive use of proverbs, poems, and songs, which he considered primary sources even if the specific authors were unknown, and found profound philosophical thought in these sources. Many in the field of African philosophy have also argued for using these kinds of sources as philosophical sources, for example, Kwame Gyekye of Ghana (An Essay 8-19) and Claude Sumner, a Canadian who researched Ethiopian philosophy for many years, and Ethiopian philosopher Workineh Kelbessa (“Logic”; Indigenous chap. 11). Even Oruka’s philosophy colleague at UON, Gerald Wanjohi, engaged in extensive analysis of proverbs (Wanjohi Wisdom). Oruka did not consider the study of proverbs to be related to his project. He narrowly focused on interviews with living sages as his only source, despite the fact that other contemporaries of his argued that one could find clear expression of logical argument as well as insightful reflection in proverbs (Sumner 22-23, 391-403). In an article he wrote on Sumner, Oruka mentioned that Sumner spent much effort studying and publishing Oromo proverbs (Practical 156), and maintained that studying proverbs is a different method than ethnophilosophy, but he did not develop these ideas. In Sage Philosophy (1990 ed. 115-16; 1991 ed. 117), the sage Simiyu Chaungo discussed the use of proverbs, but it is the only time proverbs are mentioned in the book.

Along with Radin, Hountondji’s 1973 article also included Marcel Griaule as an example of anthropologists whose methods differed from Tempels’ (31). Griaule interviewed Ogotemmeli, a Dogon elder in Mali, at length. Hountondji was disappointed that certain political factions inside and outside of Africa preferred Tempels’ style of massive, definitive synthesis of all Bantu views to capturing the plurality and disorderliness of individual thought by direct interview. In the preface to the second edition of his book, which included “Philosophy and Its Revolutions,” Hountondji reiterated his 1974 opinion of Griaule as an important trend-setter:

The French anthropologist had chosen to transcribe the words of one sage among many. He showed the possibility of a long term project which would consist of a systematic transcription of such speeches, at least as a starting point of a critical discussion—what my Kenyan colleague the late Odera Oruka would later call “philosophical sagacity”—rather than as reconstruction of implicit philosophy behind the habits and customs of the host society through a lot of non-verifiable hypotheses which always amount to over-interpreting the facts (ix).

In 1996, Hountondji saw Griaule’s project as an earlier version of Oruka’s project. He reiterated his estimation of Griaule in his reflections, published in English as The Struggle for Meaning (2002). In this work he reflected on his views back in 1970, saying of Griaule’s work: “Voluntarily assigning to himself the humble task of a secretary, custodian, transcriber of the worldview of a black sage, of one spiritual master among others, the French ethnologist gave the example of scientific patience and, in my eyes, did more useful work than the ethnophilosophers proper who were in a hurry to reach definitive conclusions on African philosophy in general” (99).

Oruka himself was not that impressed with Griaule and Ogotemmeli. In his 1983 article in International Philosophical Quarterly, later included in Sage Philosophy, Oruka argued that Ogotemmeli was at best a “folk sage” and not a philosophical sage, because he did not transcend his group’s views. Therefore, Griaule was not engaged in sage philosophy, but only in “culture philosophy” (Oruka Sage, 1991 ed., 34, 47, 49-50).

Hountondji and Oruka both missed research published by other anthropologists in the 1960s that cast doubt on whether Griaule really followed his professed method of interviewing one person and transcribing what that person said. D. A. Masolo made a thorough review of the anthropological literature on Griaule, most but not all of it in French, in which the authors questioned whether the conversation was recorded verbatim on the series of days that Griaule recounted. They suspected Griaule of reconstructing the conversation (Masolo African 69, 77, 260). Jack Goody’s book review discussed the painstaking detail an interview must have in order to meet standards of even a “soft” science like anthropology. The words of the person interviewed should be clearly demarcated from those that are the author’s commentary. Field notes should be identified as such and distinguished from the words of the on-site translator. Original language transcriptions should be available, and the difficulty of translating esoteric words should be discussed by the author. Griaule’s book did not meet these standards (Goody review). Kibujjo Kalumba, who considered Griaule’s book on Ogotemmeli one of three possible sources of sage philosophy, complained that the book contained too much of Griaule’s re-wording of Ogotemmeli’s ideas (274, 276).

While Oruka declared in 1972 his intent to interview wise elders, he had just the previous year been quite critical of another philosopher’s use of the interview method applied to the topic of Ethics. Tore Nordenstam, a Norwegian based in Khartoum, Sudan, had interviewed three of his students, and on the basis of the interviews, published a book called Sudanese Ethics. In his rather harshly critical review of the book, Oruka questioned how interviews could be helpful at all in the study of ethics.

Oruka himself changed from someone with antipathy toward Nordenstam’s project to a person who promoted a large project interviewing African sages. His own project tried to avoid all of the pitfalls he pointed out in Nordenstam’s project: he did not interview students; he tried to interview those without exposure to studies in European philosophy; he addressed gender issues in most of his interviews; and he asked his interviewees sensitive political questions, even at great risk to himself (as in his interviews with Oginga Odinga). He shared with Nordenstam the focus on ethical issues.

Before leaving this section on early precursors and influences on sage philosophy, it is important to note that a Kenyan scholar wrote an article in 1959 that is considered by several African philosophy scholars to be a clear precedent to sage philosophy. Taaita Towett (d. 2007) is known these days mostly for his role in Kenyan education and politics. As Minister of Education, he was “Patron” of the Philosophical Association of Kenya (see Thought and Practice 1.2 [1974] inside back cover). Towett’s 1959 article, translated into French as “Le Role d’un philosophie Africain,” “earlier expressed an identical argument” to Oruka’s, according to Ochieng’-Odhiambo (“The Tripartite” 30n4). In the PhD thesis he wrote under Oruka’s supervision (later excerpted in Sage Philosophy) and in a 1983 journal article, Anthony Oseghare claimed that Towett’s 1959 article provided “evidence of the existence of critical philosophical reasoning in Africa” (Oseghare “Sagacity” 95; Oruka Sage 1991 ed., 237). D. A. Masolo noted that Towett, as Oruka did later, argued that literacy was not a prerequisite for philosophizing and that Socrates was an example of an oral philosopher. Towett and Oruka both contended that “there must have been African philosophers engaged in the formulation of culture philosophy” (Masolo African Philosophy 236).

3. Beginning Interviews in Kenya

In his published works, Oruka explained that he began his sage philosophy project along with his philosophy colleague Joseph Donders, a White Father from the Netherlands (“The Fundamental” 54n6; Sage 1991 ed., 17-18). Donders explained that the funds for the study were originally received from the UON’s Dean’s Committee (“Don’t Fence” 11).

Oruka’s early publications describing his projects and his methods began in the mid-1970s. At the time, Oruka made it clear that his project was a national one, and was to include wise sages from a wide variety of ethnic groups in Kenya. At this time, there was a lot of focus on building up Kenya’s national identity, and Oruka wanted his project to be a unifier for the country, where all Kenyans could take pride in a common heritage of wise philosophers. He also wanted Kenyans to evaluate and be able to justify their cultural practices (see Oruka “Philosophy”; Ochieng’-Odhiambo Trends, 116-117; Presbey “Attempts”). At the same time, Oruka focused on sages who could articulate reasons for their philosophical and ethical positions that did not rely on mere tradition or on religious authority. He also focused on the individual identities and arguments of the sages rather than melding the ideas of individuals into the “group think” of an ethnic group; to do the latter would have been to engage in the common error in African studies in philosophy.

As F. Ochieng’-Odhiambo has noted, the exact terminology for Oruka’s project has changed over time. In 1974, when Oruka first announced his project, he called it “Thoughts on Traditional Kenyan Sages.” He first coined the term “philosophic sagacity” in 1978, referring to individual critical and reflective sages engaging in thought in such a way that even European or analytic philosophers would have to admit that philosophers were present in Africa. He created and emphasized the approach as an alternative to ethnophilosophy, which he disparaged. Ochieng’-Odhiambo noted that as early as 1983, Oruka called those engaged in philosophic sagacity “sage philosophers.” He contrasted them to ordinary sages (later called “folk sages”) who, in 1983, were not considered philosophical because they lacked critical reflection and ability to create independent positions on topics. In 1984, in “Philosophy in English Speaking Africa,” Oruka used the term “sage philosophy.” At first, the two terms “philosophic sagacity” and “sage philosophy” were used interchangeably and no distinctions were drawn. But during this third stage of Oruka’s works (1984–1995), he used the term “philosophic sagacity” increasingly less, while he used “sage philosophy” increasingly more. Oruka then used the term “sage philosophy” retrospectively to refer to his pre-1984 works (Ochieng’-Odhiambo, “The Evolution” 19, 24).

The term “philosophic sagacity,” Ochieng’-Odhiambo says, was first presented in Oruka’s “Four Trends in African Philosophy” at a conference on Dr. William Amo in Accra, Ghana, in July, 1978 (Oruka Trends 21n1; also see Ochieng’-Odhiambo “Philosophic Sagacity: Aims”). “Four Trends” was later revised and presented at the World Congress of Philosophy conference in Dusseldorf, Germany, in August, 1978 (Ochieng’-Odhiambo “The Evolution” 22, 30n6). However, a Nigerian philosopher, M. Akin Makinde, commenting on Oruka’s popularization of the term, claimed to be the originator of the term in the context of African philosophy. Makinde said he used the term “philosophic sagacity” (with a different connotation than Oruka) earlier than Oruka in a conference paper he presented in June, 1978, at University of Ife (Makinde “Robin”; “Philosophy” 107). Makinde’s 1978 paper drew upon concepts in Bombastus Paracelsus’ essay Philosophia Sagax. Collins English Dictionary explains that “philosophic” is a term created in Middle English around 1350-1400 C.E. that meant “learned, pertaining to alchemy.” Makinde claimed that Oruka used the term and concept “wrongly” but admitted that Oruka’s usage became the more widespread (African 9, 122, 137). Many scholars in African philosophy do not pay attention to the term “philosophic” and refer to Oruka’s method as “philosophical sagacity” (for example see Hallen African 68-75; Imbo 25-26).

Oruka articulated his project and his methods in the context of growing debates on the topic of African philosophy. He spearheaded the founding of the Philosophical Association of Kenya and the creation of its journal, Thought and Practice, in 1974. In his famous “Four Trends” article, he divided African Philosophy into four trends with differing methodologies (ethnophilosophy, nationalist-ideological philosophy, professional philosophy, and his own philosophic sagacity). At these venues and in publications he explained how his own project was not just another example of the wrong-headed “ethnophilosophy” approach (criticized by Paulin Hountondji) but was instead an alternative to it.

In a 1988 article of Oruka’s first published in German and later included in English in Trends (50-69), Oruka described his sage philosophy project, listed eight sages (all men) who were part of his study, and gave a biography of each. Two of them, Paul Mbuya Akoko (d. 1981) and Oruka Rang’inya (d. 1979), would be included at greater length in his soon-to-be-released, book-length study of sage philosophy. The others mentioned in 1988 had only biographies and short excerpts of their interviews in the German-language article, which were repeated in two books. These latter sages were Njeru wa Kanyenje, Nyaga wa Mauch, Arap Baliach, Muganda Okwako (d. 1979), Joash Walumoli, and Kasina Wa Ndoo (Trends 57-61, 66-67; Sage 1991 ed., 37-40). Oruka explained that he and researcher Jesse N.K. Mugambi interviewed Njeru wa Kanyenje of Embu district together, in the Embu language (Trends 66, 132).

Oruka’s book Sage Philosophy was published first by Brill in 1990 and later in Nairobi in 1991. There are a few differences between the two publications, but most changes are minor editorial ones, with the major exception that chapter one of the Brill edition has an extra twelve pages telling the background of the study. The book has three parts. The first is Oruka’s introduction to his project. Here, Oruka gathered (with little revision) several of his articles on sage philosophy that had been published over the years. The second part includes interviews with sages, and the third part includes commentators and critics. Documentation of the sages as individuals, and the publication of their originally oral philosophical thoughts, are crucial to Oruka’s methodology; this stands in contrast to ethnophilosophy’s practice of summarizing what informants (often anonymous) say and searching for a common denominator. Also in the second part, a brief biographical sketch and photograph precede each interview. Oruka insisted on identifying both folk and philosophic sages in the same manner. In this way, his project does not merely cover the same ground as ethnophilosophy.

The book minimizes the editorial/interpretive role of the professional philosopher, in comparison to other anthropological approaches, by including direct excerpts from interviews of sages who were self-conscious of their role as cultural critics and were respected for the critical views they articulated. Interviews with sages covered topics related to philosophy of religion (such as the existence of God, life after death, and so forth), free will and determinism, and ethics. These topics were of central concern to Oruka, whose own academic background from Uppsala was in practical philosophy rather than theoretical philosophy. Oruka mentioned “Chaungo Barasa, Fred Ochieng’-Odhiambo, Sam Oluoch Imbo, Samuel Wanjohi Kimiti and Mwangi Samuel Chege” as his key research assistants in the project (Sage 1991 ed., xi).

Oruka closely followed this first book-length publication with a monograph focusing on the interviews of Jaramogi Oginga Odinga. He explained that for the 1982 interviews he was accompanied by E. S. Atieno-Odhiambo, a well-known Kenyan historian who focuses on oral history, and in 1992 Chaungo Barasa assisted him. Odera Oruka provided his own commentary on the interviews, which focused on Odinga’s love of truth, and how Odinga’s commitment to truth and love of the masses contrasted with Plato’s own position in the Republic regarding the myth of metals, sometimes called the “noble lie” (Oginga Odinga xi, 3-4, 12-13).

4. Relationship to the Hallen-Sodipo Study

Barry Hallen and J. Olubi Sodipo engaged in a research project that involved interviewing wise men among the Yoruba in Nigeria. They began their project around the same time as Oruka, in 1973-74. As Hallen and Sodipo explain, they started in 1973 with a non-credit student study group at the University of Lagos. During university breaks they asked students to “establish face-to-face fieldwork relationships with the elders and wise men of their family compounds, villages, and towns” (Hallen and Sodipo 9). They chose the concept of the person as the theme for these first discussions. After this first study, they interviewed people in the Ekiti region from 1974-84 and moved the project to the University of Ife (now Obafemi Awolowo University) in 1975 (Hallen and Sodipo xvi, 11). Sodipo became head of the newly independent philosophy department that separated from the religious studies department in 1975.

Hallen and Sodipo chose to study herbalists and native doctors because they were more critically sophisticated than the “ordinary persons” whom they advised, and were able to offer theoretical concepts (10-11). They explained that the onisegun (Yoruba wise men) they interviewed were organized into their own professional society called an egbe, with rules, evaluations, possible reprimands, and a pledge of secrecy. The onisegun were not mere masters of medicine, but rather, they “[gave] advice and counsel about business dealings, family problems, unhappy personal situations, religious problems, and the future, as well as about physical and mental illness” (13). They did not name their individual interlocutors because, as they explained, those they interviewed requested to remain anonymous (14). They acknowledged that the practical questions regarding interviewing methods were many, and they tried to sort out the question: “is each man to be treated as an individual, potentially eccentric thinker, or are opinions to be somehow collated and presented as shared and communal?” (8). They followed the latter plan, due to the fact that they were studying language use. Their study yielded philosophical insights into how the words “knowledge” and “belief” were used and understood. They noted that among the Yoruba, the use of the term translated as “knowledge” is much narrower than the usage in Britain or the United States, because it was reserved for first-hand knowledge alone. In Britain or the U.S., people commonly claimed to know a vast amount of information (in the form of propositions) that went beyond their first-hand knowledge (see Hallen and Sodipo; Hallen “Yoruba”).

Because it involved academic philosophers interviewing wise elders in Africa, many people associated the Hallen and Sodipo project with Oruka’s sage philosophy project. However, at least in some of his writings, Oruka clarified that he did not consider their work that of sage philosophy due to its lack of emphasis on individual sages. In fact, Oruka complained that it looked like the onisegun of the study held views “in consensus” and therefore to study their views was “anthropology, not philosophy” (Oruka Sage 1991 ed., 8-10; quote, 10), or even “culture philosophy,” “cultural prejudices” or “philosophication” (Oruka Sage 1991 ed., 50). “Philosophication” is a term that Oruka intended to have a derogatory tone. At one point he defined it as “the discovery of a philosophy out of no philosophy;” he also played with coining the word “philosofolkation” which involved loving the “folk” so much that one invented a philosophy for them and made oneself its spokesperson (1990b, 7). Oruka’s criticisms began as early as his 1975 article, when he charged J. O. Sodipo with trying to pass off African superstitions regarding the agency of the Yoruba gods as an African understanding of cause and, hence, philosophical (Oruka “The Fundamental” 48). In a more conciliatory tone, he wrote in his 1983 article that the Hallen-Sodipo project, like Griaule’s Ogotemmeli, while not “philosophic sagacity,” may be “some form of sagacity” (Oruka “Sagacity” 389; Ochieng’-Odhiambo Trends 133).

On this point, Ochieng’-Odhiambo pointed out (“The Evolution” 27) that a particular endnote in an article in Oruka’s 1990 book, Trends in Contemporary African Philosophy (Oruka Trends 68), suggested that Hallen and Sodipo’s project might be part of sage philosophy, despite Oruka’s clarification in other works (Oruka Sage 1991 ed., 8-9, 50) that it was not. This endnote is a bit indirect. Oruka listed Hallen and Sodipo’s works along with several others that directly address sage philosophy, and then added the caveat, “It is not the case that every one of these writings addresses itself to the direct question of Sage philosophy. But they all make special reference to a type of thinking in Africa that can only owe its existence to the thoughts of some wise men (and women) in traditional Africa.” This statement makes it sound like Hallen and Sodipo were fellow travelers. Interestingly enough, Oruka mentioned that at a certain point in his research he interviewed some sages who wanted their names withheld (Sage 1991 ed., 65n4), and he mentioned specifically a parallel with Hallen and Sodipo’s study.

In his 2006 book, African Philosophy: The Analytic Approach, Hallen agreed that it was best to keep his own project and Oruka’s separate. As good grounds for separating them, Hallen explained that his and Sodipo’s project was always intended to be an exercise in philosophy of language, and he admitted that such was not the case with Oruka’s interviews. He also acknowledged that Oruka wanted to keep them separate (4–5). But he also explained, in Knowledge, Belief, and Witchcraft, that he thought that the kinds of description of their project that Oruka engaged in were unkind and unfair. Oruka did not take into account that when one does philosophy of language one cannot help but search for common usages of terms and concepts. Hallen recounted in an afterword to the 1997 edition of Knowledge, Belief, and Witchcraft the shock he experienced upon first reading criticisms of their work such as this. He and Sodipo had been bracing for criticisms from anthropologists; they expected to be told that they weren’t properly trained to do fieldwork. But they were surprised to find themselves criticized by philosophers for advocating a communal consensus account of African thought, basically being accused of the dreaded “ethnophilosophy” as Hountondji had described it.

Hallen asked Hountondji and Oruka to rethink their criticism, since there was no way to practice ordinary language philosophical analysis, whether in Africa, England, or elsewhere, without focusing on common meanings. Hallen thought that their study was able to debunk many prevailing myths and stereotypes about Africa, including misconceptions made popular by some anthropologists who considered African thought pre-reflective, uncritical, traditional, emotional, and non-reasonable. This was evidence that they should be appreciated, not lumped in with anthropologists and ethnophilosophers whose projects were evaluated negatively (Hallen and Sodipo 136-37n16; 140). Indeed, one of the surprising conclusions of Hallen and Sodipo’s study was that the onisegun had such stringent criteria for counting something as knowledge (that is, restricting it to first-hand experience, and requiring careful reporting and testimony from all witnesses), that they made Euro-Americans who accept second-hand propositional knowledge as true seem “dangerously naïve or perhaps even ignorant” in comparison to the onisegun (Hallen Yoruba 299).

While discussing parallels in Nigeria, it is important to note that Campbell S. Momoh (d. 2006) engaged in interviews with elders of the Uchi community. Momoh says he responded to Hallen’s call for philosophers to go to villages to discuss philosophical topics with illiterate elders (Momoh “African” 99). He cited as his intellectual sources for the methodology of the project not Oruka but instead both Paul Radin and William Abraham. In his 1962 book, Abraham distinguished public philosophy from private philosophy, referring to Griaule’s study of Ogotemmeli as an example “of an individual African philosopher rather than a repository of the public philosophy” (104). Momoh saw a commonality between Radin’s notion of the African intellectual and what Abraham called “private philosophy” (Momoh The Substance 53, 55). Momoh insisted that interviewees should be named and credited.

Momoh was himself involved in interviewing elder sages. He did his dissertation fieldwork in 1978 and submitted his dissertation in 1979 to Indiana University. His dissertation committee included William Abraham and Ivan Karp (An African Conception). The dissertation includes lengthy sections naming elder interlocutors (such as Aliu Oshiothenaua, Saliu Ikharo and others), paraphrasing their conversation in detail as well as quoting them directly (92-120). Momoh also provides contextual background of the sages’ standing and purpose in their communities (see especially 45-48, 67-70, 85-87). He even mentions seeming interruptions in the discussion, such as the presence of a young boy or a chicken, and how the conversation is shaped by these interactions (something for the most part missing from the interviews in Oruka’s study). Topics focus on metaphysics and ethics. Along with accounts of the elders’ discussions, Momoh includes his interpretation and analysis of what the elders say. While the elders may convey their ideas in story and myths, these do not just reflect communal philosophies since some of the stories are creations of individual men (for example, Ikharo’s story of woman’s refusal to accept marrying man as her God-given duty and role, see 116-117).

In his published work, Momoh names some elders, quotes them verbatim, and gives specific examples of methodological challenges during his interview of them (“African Philosophy” 87-88). He named Aliu Oshiothenaua, Pa Egbue, Pa Abudah (Momoh’s uncle), and a hunter named A. M. J. Momoh (The Substance 66, 245, 254-55, 376-78). He found in the interview of the hunter a “doctrine of existential gratitude” (The Substance 382). Oshiothenaua asserted a theory of human dependence on nature (The Substance 376). An ethnophilosophical study that merely explored communally held beliefs in the sense of Abraham’s “public philosophy” would be incomplete, Momoh insisted, because “alongside with it” it would need to name individual intellectuals and add additional contextual information such as the time period, cultural paradigm, and branch of philosophy relevant to the discussions. He criticized Bodunrin, who wanted to make an “absolute dichotomy” between ethnophilosophy and the sagacious elders, since, according to Momoh, the latter were based on the former–that is, the “sagacious elders” philosophized in a general context provided by public philosophy (“African” 77-78, 80-81; The Substance 56, 58, 59).

Momoh also insisted that sagacious elders had a better practice than much of contemporary analytic academic philosophy, since their goal was not the narrow one of negatively appraising received ideas, but the broader project of building holistic systems and attending to important moral issues (“African Philosophy” 91; The Substance 69, 75, 78). Oruka noted that in his earlier 1985 article Momoh seemed unaware of Oruka’s sage philosophy project (Oruka, Sage 1990 ed., xxiv) and castigated Oruka as a member of the “African logical neo-positivists” who denigrated ancient African philosophy (Momoh based this estimate on Oruka’s 1972 article critical of myth, see The Substance 64). Momoh later revised his estimate of Oruka and acknowledged his sage philosophy project (The Substance x). In an article originally published in 1987 (included in Sage 1990 ed.), Oruka expressed his agreement with C. S. Momoh’s position that the names of sages interviewed must be given and their views credited to them (Sage 1990 ed., 20). Fayemi Ademola Kazeem considered Momoh to be engaging in a sage philosophy project as was Oruka, noting that Momoh preferred to call it “ancient African philosophy” (Kazeem 196).

Godwin Azenabor included Hallen and Sodipo, Momoh, Oruka and others in a common category of African philosophy which he called the “Purist school” because all were committed to the assertions that Africa has a similar practice of raising philosophical questions and answering them as does the West; however they all saw the need to break free of Western paradigms, conceptual schemes, and conditioning. All in the Purist School emphasized the relevance of African culture and tradition for both philosophy as well as models for African development (Azenabor Understanding xiv). While the choice of “Purist” as a descriptor can be questioned (see Sophie Oluwole’s defense of Oruka’s project as admitting up front the multiple influences on contemporary rural sages, in Graness and Kresse Sagacious 155), Azenabor’s categorization helps us to see the common themes and approaches of authors who emphasized their distinction from and competition with each other.

5. Folk Sages and Philosophic Sages

In some works Oruka was at pains to distinguish “folk sages” and “folk sagacity” (wise elders who could recount community traditions and beliefs but not take a critical, evaluative stance toward them) from “philosophical sages” or “philosophic sagacity” which were the interviews and ideas of particularly reflective and evaluative sages. The distinction copied “first order” and “second order” distinctions in philosophy to a great extent. Many philosophers concluded that the only important part of the sage philosophy project was the “philosophic sagacity” part. However, such an approach left unexplained the role that folk sages played in the project. Why continue to include folk sages if they are examples of unphilosophical individuals? Several scholars addressed this thorny topic (Presbey “Sage Philosophy: Criteria”; Van Hook).

Omedi Ochieng noted the irony that while Oruka first began his project to debunk Western scoffers who thought Africans were involved in unreflective groupthink, his comments championing the philosophic sages as “geniuses” in contrast to folk sages and other Africans who were satisfied at following others and not thinking for themselves ended up reinforcing the negative stereotype of Africans (“Epistemology” 348-351). He thought that Oruka capitulated and accepted academic definitions of philosophy that belittled folk wisdom and championed abstraction in a way that silenced the important contributions of many Africans (“Ideology” 153-57). Oluwole likewise noted that in some of Oruka’s texts he seemed to define “philosophy” so narrowly that even his own sages would fail to meet such narrow criteria, which would ironically lead to the failure of his own project. She insists, however, that if the sage interviews could be approached by sensitive scholars familiar with the sages’ language and context, without the near-ubiquitous prejudice against finding philosophy in African oral practices, the project in this sense is very promising (Oluwole in Graness and Kresse 158-61).

An additional problem is that even when Oruka sorted out his folk and philosophical sages, the folk sages still demonstrated the intellectual virtues Oruka insisted belonged only to the philosophical sages. To illustrate this point, let me highlight that each of the seven “folk sages” in Sage Philosophy (chap. 6) distinguished their views from those of their communities on at least one topic. Chege Kamau said that he didn’t believe the afterlife consists in ancestral spirits as others believe. Rather, he posited, all people rejoin one big soul, which he called God. Joseph Muthee advocated sometimes unpopular inter-tribal marriages as a means of building a national culture. Ali Mwitani Masero argued that death is the end of the human being. Zacharia Nyandere said he believed men and women were equal, despite Luo perceptions to the contrary. Abel M’Nkabui said all humans were equal, and that inequalities were historical accidents. Based on this conviction, he was critical of Meru prejudice against blacksmiths. Joseph Osuru said that the Teso think that God does not belong to other tribes or races. But he thought that God belongs to all people. He also mentioned that some Teso think that having dreams of the deceased is proof that they live in a world after death. But, he pointed out, having a dream is not proof. Peris Njuhi Muthoni said that it was good that the practice of female circumcision is dying, because it led to medical problems. She stated that it was her conviction that Luo should not remove their teeth as a rite of passage. These concrete examples show that all of the so-called “folk sages” can critique their own societies, an attribute Oruka assigned only to the “philosophic sages.”

Oruka listed “philosophic sages” in their own chapter (chap. 7). The sages included there were Okemba Simiyu Chaungo, Oruka Rang’inya, Stephen M. Kithanje, Paul Mbuya Akoko, and Chaungo Barasa (Sage 1991 ed., 109-155). An additional aspect of the sage philosophy project was that Oruka did not want the project to stay on the descriptive level. He wanted Kenyans to read and grapple with the ideas of the sages, evaluate them, extend them, and apply them to their lives. However, his own published commentary on the interviews was brief (Trends 64-65). In Sage Philosophy, he left the job of commentary on the interviews to his student, Anthony Oseghare (Sage 1991 ed., 156-160).

D. A. Masolo made the point that it is not mere disagreement with one’s cultural group that makes one a philosophic sage, but rather that “the criterion for a moral ideal, according to the sage, is not that it match the historical belief of the community but that it satisfies an acceptable idea of right, fairness, and respectfulness toward all those who are involved or may be affected by its practical application” (Masolo “Sage”). He gave the example of a sage who would counsel against the practice of a certain ritual if it would jeopardize the health of an individual. In these circumstances, the important criterion “was not their mere variance from the communal beliefs of the sages’ own groups but also a theoretical account provided by the sage as the foundation of his or her own view. . . The sage attends to the rationality of views rather than to the judgment of the group” (Masolo “Sage”).

One of the tensions found within sage philosophy is that, while Oruka privileged sages critical of their societies’ prejudices, as in the examples above, he also championed sages who hold in high esteem traditional values forgotten or marginalized by young Kenyans. In a 1979 research proposal for sage philosophy, he explained that his project was a way of defending his nation from the “invasion by foreign ideas,” which could not be stopped by guns but instead must be combated on the level of ideas. This cultural invasion included worship of technology and an adherence to crass materialism as a measure of success. Oruka bemoaned the fact that African traditional morality was already eroded by European colonialism, and its replacements, Christianity and Islam, he argued, were incapable of standing up to the cultural erosion of values (“The Philosophical”).

Oruka often asked questions about the proper relationship between men and women during his interviews with sages. Many of the sages insisted that women were inferior to men. Oruka cautioned readers that the sages were reflecting the cultural prejudices of their times, and he reminded those familiar with Western philosophy that such assertions of women’s inferiority could be found as well throughout the Western canon of well-known and respected philosophers. Still, he was proud of the fact that some of his sages held relatively progressive views on this topic (Sage 1990 ed., xix-xx; Ochieng’-Odhiambo Trends, 136), and he even had one sage’s views on the topic published in a Nairobi newspaper (“Paul Mbuya”). The views asserting men’s superiority could be found in the sages interviewed by his student F. Ochieng’-Odhiambo and Ngungi Kathanga. In Oruka’s studies as well as his students’ studies, few women sages were interviewed. Gail Presbey has drawn attention to women sages in her works (“Who”; “Kenyan”).

6. Criticisms of Sage Philosophy

From early on, critics from within the community of African philosophy scholars put forward their criticisms. Oruka included three critics (Bodunrin, Kaphagawani, and Keita) and three supporters (Outlaw, Oseghare, and Neugebauer) in Sage Philosophy. Peter Bodunrin said Oruka’s sages were not enough like the Greek philosophers, who expounded their view in a context of literacy (Oruka Sage 1991 ed., 163-179, esp. 168-69). Lansana Keita said that when Oruka relegated creative individual thinking to the critical views of “philosophic sagacity,” he failed to acknowledge that the folk or ethnophilosophy of the community could itself be a product of earlier creative individual philosophizing (Sage 1990 ed., 210). While some of these criticisms were perhaps based on a misunderstanding of Oruka’s project (see Bewaji review 109), Oruka did appreciate the debates that ensued and responded to these critics in his own articles, which were included in the first part of the book.

After the publication of the book, criticism continued. D. A. Masolo said the sages Oruka quoted often made comments that were no more than common sense, perhaps with some cleverness thrown in, rather than sustained arguments (Masolo African Philosophy 236-245). Ochieng’-Odhiambo had a clever and insightful response to this kind of criticism. “The idea that philosophy must always operate at a higher rarefied level with deep abstractions is not always true . . . Philosophy can, in many ways, be expressed very simply”; in fact, he agreed with Christopher Nwondo, who advocated that philosophers in Africa should attempt to write in clear and simple language (Trends 138). But Ochieng’-Odhiambo did clarify that Masolo was not against the sage philosophy project itself, but had just stated that he thought the interviews included were not yet strong enough to prove the point to his liking (Trends 137).

Tunde Bewaji reviewed Sage Philosophy and was impressed by Oruka’s sage interviews because they “reflect a clarity of thought which is not seen in ethnographic, anthropological or sociological studies” (106). While Simiyu Chaungo argued that God was the sun, because without the sun there could be no life, Ali Mwitani Masero, on page 96 of Sage Philosophy, argued that if God created the sun, God cannot also be the sun. Bewaji also commended Osuru’s criticism of popular practices that regarded dreams as evidence about the afterlife. Bewaji pointed out that many persons from so-called civilized societies still consider dreams evidence of another world. He also commended Kithanje for arguing that there could not be many gods, because such gods could not account for the uniformity of creation (106-07).

In chapter four of his book, Philosophy in an African Place, Bruce Janz reflected upon Oruka’s sage philosophy project. He noted that the approach seemed to solve the paradox of African philosophy by appealing to universal principles of reason and exploring the context of African lived experience. Yet, Oruka imported Western philosophical ideas to a large extent and left them mostly unacknowledged. This was problematic since his project purports to be all about African philosophizing. Additionally, Janz offered critiques of the methodology. The method at first looked promising, by focusing on conversation between sage and the interviewer (an academically trained philosopher) where the two cooperatively worked toward truth. Yet, to Janz, it often sounded nevertheless like it was the academic philosopher who focused upon and made manifest the latent reasoning in the sage’s conversation. Janz noted that past, outmoded ethnographies turned Africans into objects of others’ studies and declared that he therefore preferred open-ended conversation. But the structure of questions that most sages were asked in interviews steered them toward certain answers that fit in the context of past Western philosophical paradigms such as asking for an essence (What is wisdom? What is virtue?). Such questions presumed that increasing levels of abstraction were abilities to be praised in a sage. Interviewers guided the sages, he argued, by eliciting the sage’s opinion on topics that the interviewer thought important. Janz also took Oruka to task for promising to evaluate which of the sages were wise according to an objective criterion. Janz noted the complex and multiple aspects of being a wise person, and suggested that it would not be easy for anyone to sort out the wise from the not-wise. Further, Oruka did not address whether or not wisdom is a culture-bound concept. Janz suggested that wisdom was better recognized intersubjectively, identified in “a process of explicating shared meanings in a community, rather than identifying an essence” (107).

Omedi Ochieng likewise insisted that the sages be placed in a context where their speech could be understood contextually, and he found several places where Oruka failed to fill in important aspects of context. In fact he questioned the “interview” as Oruka’s chosen method, suggesting that sages might not understand an interview as a context in which to justify their philosophical beliefs when challenged by a provocateur. Adversarial debate is a particular form of philosophizing that may not be valued by the sage. But Ochieng did think that interviews with sages in some form should still be done in a “reconstructed” version of African sage philosophy (“Epistemology” 346-47, 359).

Janz similarly suggested that Oruka depended too much on the idea of philosophizing as critique and divergence from communally accepted beliefs. Why not look for other signs of wisdom, such as creative thinking? Janz found many examples of creative thinking among the sages, such as Stephen Kithanje’s “fecund metaphor of God being like heat and cold.” Likewise, Okemba Chaungo showed through his debate of the relative good of wisdom versus land that the seeming contradiction could be overcome by understanding different senses of “good” (109). In general, Janz was frustrated that sage philosophy was not more self-critical about its methods, did not come to terms with its positionality, and did not devote time to critiquing its own methods.

W. J. Ndaba critiqued Oruka’s work, arguing that the ideal of philosophy as “an individual, explicit, critical and self-critical ratiocinative consciousness” was a Western notion, since such emphasis was “counterproductive for the emergence of a genuinely rooted African philosophy” (17). He held that an African perspective would value the folk sage, that is, the person who consulted the wisdom of their community and did not try to do it alone. He referred to the Zulu proverb, Iso—elilodwa—kaliphumeleli (“An eye—when it is one—does not succeed”), to emphasize the importance of consulting other persons who could “note points of detail which elude him or unforeseen snags which turn up to mar his plan” (20-21). He disagreed with Oruka’s claims that the philosophic sage was more valuable than the folk sage. He did, however, appreciate Oruka’s emphasis on the philosophical sage being able to warn society against holding one-sided or close-minded, ethnocentric views.

While there have been critics of sage philosophy, there have also been many scholars who have appreciated its contribution. In addition to those already mentioned above, substantive treatments of Oruka’s project can be found in the works of Lucius Outlaw (in Oruka Sage); Sophie Oluwole, Muyiwa Falaiye and Ulrich Loelke (in Graness and Kresse), and others.

7. Culture Philosophy and Its Relationship to Philosophic Sages

Oruka was convinced, both by his training in practical philosophy as well as his own sense of values and priorities, that philosophy in general, and the sage philosophy project in particular, had to address itself to the concrete problems facing Kenyans and Africans. It should address issues in the present and suggest a course of action to make Africa’s future better. Thus, he wanted his project to be both practical and accessible to a general audience beyond academia. He often wrote for the newspapers, such as the Daily Nation, and other popular publications. In 1986, he participated in a study sponsored by the Institute of African Studies at the University of Nairobi called “Kenya’s Socio-Political Profiles” where he was required to contribute a broad outline of the general beliefs and practices of the Luo ethnic group (Oruka Sage 1990 ed., 53, 58-61). In 1986 he became an expert witness for a now-famous trial often referred to as the S. M. Otieno burial saga. Oruka took the witness stand, and gave an account of the philosophy and practices of burial among those from the Luo ethnic group. He argued that his expertise was due to his study of so many interviews with philosophical sages from the area. He included a transcript of his evidence in court in Sage Philosophy (1990 ed., 65-80).

Note that “culture philosophy,” that is, an account of the prevailing beliefs of an ethnic community, was an offshoot of interviews aimed at discovering philosophic sagacity. In order to see how a particular sage deviated from norms in his individual, critical thinking, the sage often began by recounting reigning shared values in his community. This “offshoot” of sorts (which Oruka had before dismissed in a disparaging way as philosophy only in a broad or even “debased” sense) now became a focus. Some experts in customary law even accused Oruka of giving the court an outdated account of practices, presented as timeless truths of the Luo ethnic group (Cotran 155). When Oruka was on the witness stand, Khaminwa, Wambui Otieno’s lawyer, asked him whether in traditional society there may be people opposed to customs who want to depart from those customs and do things their own way. Oruka explained to Khaminwa that “in a traditional communal society there were very few rebels” (Sage 1990 ed., 70). He minimized the existence and role of such dissent, even though in his academic work on sage philosophy he particularly championed such dissent.

Rather than see him as taking on the role of ethnophilosopher, Ochieng’-Odhiambo suggested that, at that point, Oruka showed that he himself was a philosophic sage able to recount the traditions of his ethnic group while also resolving any inconsistencies (Ochieng’-Odhiambo Trends 125). Masolo thought that Oruka’s popularity grew because of his role in the trial, due to his ability to unmask the faulty logic of the widow’s defense team that equated “modern” with “Western” in a stereotypical and unfair way (“Sage”). Be that as it may, the court case can also be seen as another missed opportunity for Oruka to champion the rights of women in a male-dominated context (Presbey, 2012, 2013).

The court case was the beginning of a new phase in Oruka’s sage research. As Oruka explained, due to his notoriety in the case, he was offered work sensitizing District Officers and Commissioners to Luo philosophy and customs. When he gave these talks, he reiterated common beliefs among the Luos and quoted individual philosophical sages (Sage 1990 ed., 58-64). He also put his sage sources to use when studying Kenyan beliefs and practices regarding family planning, for the Department of Populations. He had two control groups, non-sages and sages, and gave the views of both. His main point was that Kenyan traditions and values already had the resources for population control through natural family planning. Further, a sensitive study of the culture of Kenyan people could reveal attitudes and practices that worked against family planning and then point the way to solutions to the problem. Here he seemed to have crossed over quite a bit into the social sciences. Dorothy Munyakho explained that his approach was still considered experimental and controversial from the perspective of people in Population Studies who were more familiar with demographics and statistics than with qualitative analysis of interview content (21).

Critic Didier Kaphagawani, in a 1987 article reprinted in Sage Philosophy, charged sage philosophy with being parasitic on ethnophilosophy, insofar as philosophic sages practiced second-order reflection and analysis of first-order ethnophilosophy (Kaphagawani in Oruka Sage 1991 ed., 181-204). But Oruka responded and clarified. He said instead that philosophic sagacity is second order to culture philosophy. Sages reflect upon the culture, though not as it is summarized in consensus form and analyzed by professional philosophers, theologians, or missionaries (as in ethnophilosophy); rather, they do so based on their first-hand observations of the culture philosophy through their personal experiences in the community (Sage 1991 ed., xxiii). This same point could serve as a fine-tuned criticism of Momoh’s terminology mentioned above, since Momoh sometimes referred to ethnophilosophy and communal philosophy without distinction. Momoh added the helpful point that all communal philosophies, not just African communal philosophies, are non-critical, and he gave some examples from Britain (The Substance 59, 63).

In an article, “Sage Philosophy Revisited,” based on a radio interview in 1993 and published posthumously, Oruka noted that some scholars considered his project “just one of the brands of ethnophilosophy,” similar to Mbiti and others, and disagreed with those critics (Practical 183). He agreed that he studied “culture philosophy” and described it as the “beliefs, practices, myths, taboos, and general values of a people” (Sage 1991 ed., xxiii). To the end, Oruka trusted his method more than that of ethnophilosophers like Tempels because he based his accounts of culture philosophy on the testimony of trusted indigenous experts (the philosophical sages), and he considered himself to be conveying only what they had told him (Sage 1990 ed., 57; 1991 ed., 43n2). Of course, there is no escaping one’s role in shaping the data insofar as the researcher, even Oruka himself, decides which parts of which interviews to highlight when presenting them to others. This methodological point was raised by Emmanuel Eze regarding Oruka’s work (Eze and Lewis 19).

It’s important to note that as time went on, ethnophilosophy’s staunchest critic, Paulin Hountondji, modified his position. He reflected on the debate that was started by his criticism of ethnophilosophy and said in 2002 that his earlier rejection of collective thought was excessive. He explained that collective culture must be taken seriously, and that individuality is fashioned from a basic personality, which has rootedness. While he agreed that individual thought should be seen in cultural context, he noted that it should not be stuck there. Roots should not become a “prison house” (The Struggle 128, 151-52, 204-05). Also, one of Hountondji’s biggest complaints about the ethnophilosophers like Tempels was that they were foreigners, or if not foreigners, at least they were writing for a foreign audience, responding to debates and criteria created abroad. Hountondji called this “extroversion,” and wanted instead to have African philosophy being written by Africans and responding to the interests and needs of Africans (“Introduction”). Certainly, the trajectory of Oruka’s interests in the sages showed that over time, the issue of proving anything to outsiders diminished in importance, as the question of how sage wisdom and reflection could help Kenya and Africa took center stage (Ochieng’-Odhiambo “The Tripartite” 21, “The Evolution” 29, and “Philosophic” 78; Kalumba 39-40; Presbey “Sage Philosophy: Criteria”).

8. Oruka’s Sage Philosophy: the Last Few Years

Oruka intended his sage philosophy project to continue to grow. He called his 1992 book, on former Vice-President of Kenya Jaramogi Oginga Odinga, a continuing study in sage philosophy (Practical 162). In many respects, Oginga Odinga was quite different from the other sages, insofar as he was literate, had formal education and extensive experience in government (being first vice president of Kenya and later a presidential candidate) and had also traveled abroad. Nevertheless, Oruka insisted that in Oginga Odinga’s role as ker, that is, spiritual and cultural leader of the Luo people, he maintained with the other sages an important commitment to the betterment of his community. Oruka also clarified that, while he had begun his sage philosophy research interviewing illiterate elder sages, because their testimony might soon be lost, he never intended his project to be limited to the illiterate, elderly or rural persons. Thus, speculations that his project would become out of date the more that literacy spread in Africa were based on a misunderstanding of his project (Sage 1990 ed., xviii). Indeed, in Sage Philosophy, he included an interview of one young, educated sage, Chaungo Barasa (a water engineer), due to his wisdom and his commitment to his community (1990 ed., 149-57).

Oruka articulated and emphasized other reasons to continue sage philosophy as a project, including the need for a generation of Kenyans who grew up in cities to remain connected to their roots. He was also concerned with the practical challenges of poverty and corruption and curtailment of liberties in Kenya. He thought that sages, from the obscure rural ones to the more famous ones like Oginga Odinga, could offer a bold moral critique of Kenyan society that could help people improve their lives both individually and as a community and nation.

Oruka’s life was cut short in a road accident in December of 1995. As a pedestrian, he was struck down by a motorist in the streets of Nairobi (Nation Reporter 40). Further studies in sage philosophy have certainly been stymied by this loss but not wholly halted. Anke Graness and Kai Kresse quickly assembled scholars to comment on sage philosophy’s legacy in a memorial book to Oruka that came out shortly after his death, Sagacious Reasoning. A book of essays that Oruka had been working on at the time of his death, Practical Philosophy, was subsequently published. This book divided Oruka’s essays into four sections, one on African philosophy and culture and the other three covering issues of truth and faith, value and ideology, and environmental ethics. Excerpts of sage interviews can be found in some collections on African philosophy (see Oruka’s interview of Paul Mbuya Akoko in Hord and Lee 32-44).

9. Sage Philosophy Research by Other Philosophers: Students

To explore the ongoing influence of sage philosophy, it’s best to cast a wide net. While “philosophic sagacity” was a specialized part of sage philosophy, the project also included folk sages and culture philosophy. It makes sense to survey those who found Oruka’s emphasis on the interview process central to their own work in African philosophy. Some of these persons did not mind drawing upon interviews as well as proverbs. Many provided extensive historical background and filled in details of the context of those they interviewed to a far greater extent than Oruka ever did in his studies, and they did so for good methodological reasons. Some refined the interview method beyond Oruka’s own practice, going more in-depth, refraining from misleading questions, and some even preferred participant observation to interview. With all of these variations, it is best to understand these works as influenced by Oruka and perhaps even as improvements on his project, rather than as strict copies.

This survey will begin with those who had been Oruka’s own graduate students. Most published work beyond their original theses and many became scholars in their own right. During Oruka’s time at University of Nairobi, MA and PhD students such as Kenyans Ngungi Kathanga, Oriare Nyarwath, Patrick Dikirr, F. Ochieng’-Odhiambo (“The Significance”), Wairimu Gichohi, and Nigerian Anthony Oseghare incorporated sage philosophy as a topic and/or interviews with sages into their studies while under Oruka’s supervision. Some of them published articles sharing their research with others. Oseghare’s thesis reiterated many points of Oruka’s own position (holding a universalist definition of philosophy, and limiting investigation to texts that met the philosophical standards of being critical, rigorous and of a second-order activity) and analyzed three sages according to these criteria. Two of the sages appeared in Oruka’s book, and Oseghare’s commentary on those two sages was excerpted and included in Sage Philosophy. But the thesis included discussion of a third sage, Oigara from the Kisii community. Oseghare liked Oigara best because unlike Oruka Rang’inya (who happened to be Oruka’s father) who explained the psychology behind “explaining events through the activities of spirits as a ploy of encouraging good behavior,” Oigara instead directly appealed to individuals’ abilities to make rational judgments (Oseghare xii). Oseghare concluded that the sages met his criteria for philosophical thinking.

Gichohi analyzed the interviews of sages included in Sage Philosophy (1991), finding contradictions in the concepts and positions held by some of the sages regarding their concepts of God. For starters, she questioned why Paul Mbuya Akoko said there must be one god to account for the orderliness of the universe. According to Gichohi, Mbuya begged the question, for who is to say that many gods must take on a mischievous character? (89). She also noted that Mbuya said that no one really knows God but later affirmed that God exists and rules nature (91). She noted that Oruka Rang’inya was involved in a contradiction between God being a concept and God’s living in the wind (93). She further was concerned that M’Mukindia Kithanje’s interpretation of God as present at the biological process of procreation confused the mysterious or marvelous with God (94). When it came to their ideas for the improvement of society, Gichohi found some of the sages’ suggestions problematic. Gichohi was particularly concerned with Mbuya Akoko’s suggestion that a criminal should be administered a drug during which time he could be reformed. She expressed her skepticism that such a procedure would reform the individual. Since being subjected to such drugs involuntarily is dehumanizing, how could one be reformed while his humanity has been eroded? In addition, Mbuya did not explain what type of offender and under what circumstance the punishment should be administered. These are all very important objections to the procedure which were not even questioned during the interview (103-04). Likewise, when Simiyu said that illness is due to laziness, his view, although perhaps sometimes true, could not account for all cases, such as physical destruction and disease brought on by earthquakes and other large-scale calamities not caused by humans (131-32).

Ochieng’-Odhiambo described in his thesis and subsequent articles that his efforts were aimed at exploring “philosophic sagacity” to prove to skeptics that Africans can philosophize. For this reason, he explained, “my efforts were channeled toward presenting the thoughts of some sages in an elaborate and rarefied manner. More specifically, I concentrated on those topics that had been the focus of most ancient Greek philosophers” (“The Tripartite” 18). By proceeding in such a way, he would not only “uncoil” the philosophical ideas and logic of the sages but also “show beyond the shadow of a doubt that philosophers existed in traditional Africa” (“The Tripartite” 19). As Ochieng’-Odhiambo explained in a 1997 article that presented some of his 1994 dissertation’s findings, “The rationale of my approach was that if the thoughts of the pre-Socratics are philosophical (and this is never doubted) and if the African (Kenyan) sages think in a similar manner, then they should also be granted the prestige of being philosophical” (“Philosophic” 174). Oruka himself made references to the sages being at least as good as the pre-Socratics (Sage 1990 ed., xv-xvi, xxv, 37), so Ochieng’-Odhiambo was clearly following Oruka’s lead. The rest of the article, based on the research he did for his dissertation, involved interviewing sages and asking them, for example, questions on change and permanence. Ochieng’-Odhiambo asked Rose Ondhewe Odhiambo whether things change or are permanent (in obvious reference to the Parmenides and Heraclitus paradox). She gave a nuanced answer: some things change more than they are permanent, and some are more permanent and change little. Certainly she used reason and put forward a rational view. Ochieng’-Odhiambo went on to interview a man, Naftali Ong’alo, who when asked what the single most important element is, argued that “water is the single most important thing in the universe” (“Philosophic” 175-77).

It’s possible to raise some methodological questions regarding the approach in Ochieng’-Odhiambo’s early works. The problem of asking “leading questions,” whether pursued intentionally or not, is a real one for any interviewer; Ochieng’-Odhiambo himself addressed the dangers of leading questions in another work of his (Trends 132-33). While his studies with Oruka were in the 1990s, he continued to address African Philosophy in general, and sage philosophy in particular, as a key topic in his philosophical writings. He gave a thorough account of Oruka’s sage project in his 2002 and 2006 articles, and in his 2010 book (Trends 115-150).

Patrick Maison Dikirr published some findings from his 1994 master’s thesis which he wrote under Oruka’s supervision. Dikirr interviewed Maasai sages on the topic of death. As Dikirr explained, by discussing death, certain ideas, values, or lessons were reinforced about life. There were ambiguous practices among the Maasais, some of which seemed to argue for an idea of the afterlife. For example, when a Maasai person saw a snake (black python or cobra) in a hut of someone who has recently died, they fed it milk, greeted it, and told it, “We are always together!” After all, the snake may be a deceased important person such as an oloiboni (diviner), a great chief or counselor, or a wealthy man. But Dikirr wondered further, were snakes fed just to avert their anger, so that humans could survive? Or, were there ethical lessons contained in the treatment of snakes, such as: do not despise strangers who may show up to one’s house? He preferred that these lessons be the real reason behind the stories. Likewise, Maasais thought that waking someone suddenly from deep sleep should be discouraged, because the spirit travels while sleeping. But, Dikirr preferred to understand this practice as a focus on the ethical values of politeness and humility toward others. Dikirr thought the Maasai conception of self was closer to the Aristotelean unitary self-experience. He found evidence to show that Maasais thought there was a permanent end to life. The dead are no longer around. The only thing left after death is how one’s personality affects the children. A person who has children will not easily fade from memory like the single person who dies without children. Here, immortality is understood as a name to remember.

Ngungi Kathanga wrote a master’s thesis on philosophic sagacity at UON in 1992. Seven male (and no female) sages, all Kikuyus from Kirinyaga district, were included in Kathanga’s study. He explained that he originally interviewed fifty women and men (he does not mention how many of the fifty were women), but only the seven men included were judged by him to be sages (96). He included three sages’ responses to questions about the equality of men and women. All three said men were superior to women. All pointed to women’s physical weakness, and some added other weaknesses. Mwangi Wangu stated that women are unable to keep secrets. But he said they are respected for their roles as child-bearers, because through the naming of children, the dead survive. Joel Rukenya said women cannot rise up to tough challenges in life, and therefore should not be put in positions of power (122-24). The sages are, however, quoted as supporting racial equality (128-131).

Regarding Oginga Odinga, Peter Ogola Onyango of Moi University claims that a philosophic sage must first become a folk sage before he or she can become a philosophic sage. He then argues that Oginga Odinga proves his ability to be a folk sage by the fact that he is chosen as Ker of the Luo. Ogola Onyango then shows that Oginga Odinga is a philosophic sage because he disagreed with popular opinion of many Luos during the S. M. Otieno burial trial, when he claimed that it is fine for Luos to be buried anywhere in Kenya (240-42).

Oriare Nyarwath analyzed several of Oruka’s sages on the topic of freedom (Nyarwath in Graness and Kresse 211-218). He went on to write a PhD thesis in 2009 on Oruka’s philosophical works which included his review of the sage philosophy project’s purpose and methodology, but he did not include interviews of sages or commentary upon Oruka’s sage interviews (139-161, 247-48). Instead, the thesis focused on the question of Oruka’s commitments and overarching themes throughout his published works.

Also, students at Tangaza College in Nairobi’s Maryknoll Institute of African Studies program were regularly offered a course in sage philosophy, earlier taught by Oruka himself, then by F. Ochieng’-Odhiambo, and later, by Oriare Nyarwath (Maryknoll “Sage Philosophy”). These students continued to interview sages; their reports can be found in the Tangaza College library. In the earlier years, that is, in the 1990s, reports were almost always accompanied by transcripts of the interviews. But after around 2000, the number of student papers containing the transcript of the interviews declined. Either students gave short quotes of the interviews, or they only referred to interviews without giving any direct quotes.

10. Sage Philosophy Research by Other Philosophers: Other Scholars

Kai Kresse’s book, Philosophising in Mombasa, got its inspiration from Oruka’s project. Kresse explained that he was seeking knowledge about knowledge in the context of the Muslim community living on Kenya’s Swahili coast. He wanted to study the self-reflexive, critical knowledge of local thinkers there. His book contained three in-depth portraits of local elder intellectuals and several briefer portraits of younger thinkers. Kresse explained how his methodology differed from Oruka’s. Unlike Oruka, Kresse did not center his study on direct questions put to each thinker interviewed, but instead observed the intellectuals during their philosophical discourses with members of their community. Kresse himself became fluent in Swahili so that he could follow such discussions directly, and read the scholars’ lectures, poetry and other writings. He lived in the Mombasa Old Town community so that he could be socially accepted and therefore placed in situations to hear and document the most interesting discussions. Kresse also helped his readers by describing the historical, religious and cultural context in which the debates occurred, as well as the personal biographies of the participants. But like Oruka and Brenner, Kresse saw a key part of his work as documenting “the utterances of the intellectuals” (31; Brenner). While Kresse added his own interpretation, he provided clear demarcation to his commentary, so that the reader could accept or reject the interpretations offered.

Kresse then followed with several chapters, each focusing on a particular thinker. Ahmed Sheikh Nabhany had as his goal the preservation of all that was good in Swahili traditions. Through poetry he was able to use his creative skills to communicate the basics of Islamic practices as well as moral guidelines and cultural practices. Nabhany was active in his proposals for preservation of a moral code that was losing ground in contemporary society. In his next chapter Kresse explored Ahmad Nassir, who in his poem “Utenzi wa Mtu ni Utu” summed up a moral code that involved respecting all human beings, that provided guidelines for distinguishing between good and bad actions, and that offered a way to measure moral status. Kresse considered Nassir to be an innovator insofar as he constructed a theory of utu (humaneness) and formulated sub-concepts that enforce utu. The next chapter focused on Sheikh Abdilahi Nassir’s Ramadan lectures. Kresse argued that Abdilahi was a sage, referring to Oruka’s use of the term in the context of his sage philosophy project. Abdilahi’s practice of rethinking his own positions on issues of dire importance to his community, and the extent of his conscious effort to clarify his ideas, made Abdilahi’s practices a clear example of philosophizing (206-07).

Kresse followed the book with an article in 2008 that engaged in a study of the concept of wisdom, based on two Swahili sages. He argued that a person is identified as wise if they are able to make others see the world in a different light or from a new perspective. He argued that wisdom required social performance and interaction (“Can,” 194, 199).

Workineh Kelbessa, a philosopher from Ethiopia who had met Oruka and was inspired by his project, used Oruka’s interview method to gain knowledge about environmental values among the Oromo of Ethiopia. He wrote a book about his findings. His work drew upon culture philosophy as well as the insights of philosophic sages. He explained, “In this work, the term ‘indigenous environmental ethics’ is used sometimes to refer to the ethical views of philosophic sages who have their own independent views, and in most cases it is used as a plural (of ‘environmental ethic’) to refer to the norms and values of various Oromo groups and of other indigenous peoples” (ch. 1). His objective was to “show how indigenous knowledge systems can serve as a critical resource base for the process of development and a healthy environment.” He cautioned that he did not intend to engage in uncritical, nostalgic acceptance of Oromo indigenous knowledge. He used various sources, but depended most upon “interviewing, focus group discussion and observation” because they “enable us to understand values and attitudes of the people towards the environment at a level inaccessible to a questionnaire.” He interviewed peasant farmers and pastoralists to learn about their concepts of time and divination, their ecotheology, and their attitudes toward wild animals, forests, and agriculture (ch. 1). His study drew upon many proverbs.

A further sage philosophy study which attempted to apply the insights gained from sage philosophy to the topic of a new national culture for Kenya was written by Chaungo Barasa, who helped Oruka conduct his sage philosophy interviews. Chaungo argued that cultural practices needed to be connected to consistent thoughts and belief systems. He suggested Kenyans re-examine their lives and cultures in five areas: the intersection/harmonization of tradition and modernity, death and burial ceremonies, marriage and inheritance, inter-family and clan relations, and leadership and role-modeling. All of this could be attained with the help of sage philosophy, which encouraged people to pursue wisdom and reflect on their beliefs. The family taught moral behavior, he noted; however, in Kenya’s modern families (making up about 35 percent of the population) there was, he argued, a lack of morality. “Modern” Kenyans, he wrote, held a flawed concept of modernity, equating it with European culture and religion, and their understanding of that culture was rudimentary and incoherent. Chaungo maintained that the modern Kenyan also had a stunted understanding of indigenous cultures and traditions; in their place were materialism, and consumerism, and status. They barely masked their distaste for rural folk and environment, Chaungo argued; yet, they engaged in gender oppression which contradicted modernity. Also, modern Kenyans were easily manipulated and bought by various politicians. Such a description showed that philosophical reflection upon tradition was mandatory in order for society to become productive and coherent.

Oral historian E. S. Atieno-Odhiambo’s article “Luo Perspectives on Knowledge and Development: Samuel G. Ayany and Paul Mbuya” (2000) analyzed and evaluated books and pamphlets written by these two sages. Paul Mbuya Akoko, interviewed by Oruka and included in Sage Philosophy, was also a writer. This article met the two criteria of quoting individual sages, and engaging in critical analysis. Since the sages addressed the topic of development, the thrust of the article also fit in with Oruka’s expressed goals for his sage philosophy project. Mbuya was not the only sage included in Oruka’s Sage Philosophy who had written down his own ideas, and yet Oruka did not analyze the written works of the sages he included in his study.

In his “Conversations with Luo Sages,” D. A. Masolo recorded a conversation of pressing issues of the day in which a sage takes center stage, and in which Masolo was a participant but did not direct the conversation. Masolo considered this an example of participant observation, which, according to some anthropologists, could be a more reliable source of texts for understanding African philosophy than interviews. Masolo included this conversation transcript in his book Self and Community (255-60) because it shed light on contemporary moral debate in Kenya. While not explicitly expressed, what “emerged” during the conversation was the question of whether the worth of abstract moral principles “ought to be judged independently of any real situation” (263). Masolo then further analyzed the issues raised, in the context of moral positions expressed by Kant, Hume, and Wiredu. In another part of the same work, Masolo drew upon the insights of a sage interviewed by Oruka, Paul Mbuya Akoko. He found these to express helpful ideas for grounding the ethics of communalism, described by the sage as, in Masolo’s words, “a norm arrived at for purposes of affecting order in the lives of people by reducing social differences and promoting peace” (50). Masolo could be seen as a contemporary advocate and practitioner of a variant of sage philosophy. His methods focused not on interviews of a sage by a researcher, but rather the analysis of discourse at various public fora in which the sages gathered, such as “palavers,” public debates and negotiations. In these contexts, sages used their mental skills and were involved in sustained critical inquiry (“Sage Philosophy”).

Richard Bell’s book, Understanding African Philosophy, devoted a section to Oruka’s sage philosophy. He wanted to take Oruka’s project further by exploring oral philosophy as an example of narrative and Socratic discourse found not only in the texts of sages but also in everyday discourse and village palavers (32-35, 111-12). For Bell, philosophy in Africa had to be tied to the experience of the lived reality of Africa, which was made up of Africa’s pre-colonial traditions, its colonial history, current harsh circumstances, and human struggles (35). Bell drew an analogy to Plato’s dialogues, such as Euthyphro, where, in the context of everyday life, circumstances give rise to philosophical dilemmas. Sages were similarly prompted to engage in discussion as well as deep thought, and they grappled with situations which gave rise to what Bell called the “narrative ‘stuff’ of philosophy” (112).

Bekele Gutema argued that sage philosophy’s method was particularly productive in exploring topics of conflict resolution, such as crises of democracy, problems of ruling elites and corruption, and ethnic strife. Sages emphasized solutions that addressed the needs and perspectives of all parties, having as their goals the harmony between people as well as between people and nature. He added what he knew about elders being involved in reconciliation from his own experience (208-11). Presbey interviewed sages with these themes in mind. She found sages in both Kenya and Ghana who shared their insights into conflict, whether interpersonal or ethnic, and their procedures for bringing estranged parties together. She quoted from her interviews with the sages and evaluated their insights (Presbey “Contemporary African Sages”; “Philosophic Sages”; “Sage Philosophy and Critical Thinking”).

Charles Verharen of Howard University engaged in a project which combined Oruka’s sage philosophy project with the methods of Claude Sumner, S.J., the scholar who studied Ethiopian philosophy while living there for 45 years. Verharen noted that Sumner, following the suggestion of Alain Locke, enlisted the aid of linguists and anthropologists to do his philosophical work, something that Oruka did not do, but that Verharen considered essential to his project. Verharen engaged in interviews both among the Oromo and, with the help of Rianna Oelofsen of University of Fort Hare, South Africa, among the Xhosa and San. Verharen explained that he was drawn to study sage philosophy out of concerns for cultural survival as well as philosophy’s survival, as he searched for “better stories to tell” in a world where human survival was jeopardized (83-88). He suggested interviewing both those known as sages and a broader group drawn from all parts of the society, questioning them in such a way as to reveal their level of critical rationality (75-76).

Kazeem likewise suggests that sage philosophy research should continue with slight modifications in order that philosophers can salvage “indigenous epistemologies threatened with extinction” and thereby contribute to a “polycentric global epistemology” (200). Kazeem names his approach “hermeneutico-reconstructionism” and asserts that it can be used to solve Africa’s current problems (200-01).

Oruka’s contribution to the field of African philosophy was substantial, and his influence is ongoing, as sage research continues.

11. References and Further Reading

Abraham, W. E. The Mind of Africa. Chicago: U of Chicago P, 1962.
Atieno-Odhiambo, E. S. “Luo Perspectives on Knowledge and Development: Samuel G. Ayany and Paul Mbuya.” African Philosophy as Cultural Inquiry. Ed. Ivan Karp and D. A. Masolo. Bloomington: Indiana UP, 2000. 244–258. African Systems of Thought.
Azenabor, Godwin. “Odera Oruka’s Philosophic Sagacity: Problems and Challenges of Conversation Method in African Philosophy.” Premier Issue. Spec. issue of Thought and Practice: A Journal of the Philosophical Association of Kenya ns 1.1 (June 2009): 69-86.
Azenabor, Godwin. Understanding the Problems in African Philosophy. Second Edition. Lagos, Nigeria: First Academic Publishers, 2002.
Bell, Richard H. Understanding African Philosophy: A Cross-Cultural Approach to Classical and Contemporary Issues. New York: Routledge, 2002.
Bewaji, Tunde. Rev. of Sage Philosophy, ed. H. Odera Oruka. Quest: Philosophical Discussions 7.1 (June 1994): 104-111.
Brenner, Louis. West African Sufi: The Religious Heritage and Spiritual Search of Cerno Bokar Saalif Taal. London: C. Hurst, 1984.
Chaungo, Barasa. “Narrowing the Gap between Past Practices and Future Thoughts in a Transitional Kenyan Cultural Model, for Sustainable Family Livelihood Security (FLS).” Presbey et al. Thought and Practice 217–222.
Cotran, E. “The Future of Customary Law in Kenya.” The S. M. Otieno Case: Death and Burial in Modern Kenya. Ed. J. B. Ojwang and J. N. K. Mugambi. Kenya: Nairobi UP, 1989. 149-165.
Dikirr, Patrick Maison. “The Philosophy and Ethics Concerning Death and Disposal of the Dead Among the Maasai.” MA Thesis U of Nairobi, 1994.
Donders, J. G. “Don’t Fence Us In: The Liberating Role of Philosophy.” 11th Inaugural Lecture. University of Nairobi. 10 March 1977. Nairobi: Joseph Gerard Publication, U of Nairobi, 1977.
Eze, Emmanuel, and Rick Lewis. “African Philosophy at the Turn of the Millennium: Rick Lewis in Dialogue with Emmanuel Chukwudi Eze.” Polylog: Forum for Intercultural Philosophizing 1.1 (2000): 1-28.
Gichohi, Wairimu. “Indigenous African Philosophical Knowledge: A Critique.” MA Thesis U of Nairobi, 1996.
Goody, Jack. Rev. of Conversations with Ogotemmeli: An Introduction to Dogon Religious Ideas, by Marcel Griaule. American Anthropologist n.s. 69.2 (April 1967): 239-41.
Graness, Anke, and Kai Kresse, eds. Sagacious Reasoning: Henry Odera Oruka in Memoriam. Frankfurt am Main: Lang, 1997. Nairobi: East African Educational, 1999. (Page numbers are the same).
Gutema, Bekele. “The Role of Sagacity in Resolving Conflicts Peacefully.” Presbey et al. Thought and Practice 207-216.
Gyekye, Kwame. An Essay on African Philosophical Thought: The Akan Conceptual Scheme. Rev. ed. Philadelphia: Temple UP, 1995.
Hallen, Barry. African Philosophy: The Analytic Approach. Trenton, NJ: Africa World P, 2006.
Hallen, Barry. A Short History of African Philosophy. 2nd ed. Bloomington: Indiana UP, 2009.
Hallen, Barry. “Yoruba Moral Epistemology.” A Companion to African Philosophy. Ed. Kwasi Wiredu. Malden, MA: Blackwell, 2004. 296-303.
Hallen, Barry, and J. Olubi Sodipo. Knowledge, Belief, and Witchcraft: Analytic Experiments in African Philosophy. Stanford: Stanford UP, 1997. Mestizo Spaces / Espaces Métissés.
Hord, Fred Lee, and Jonathan Scott Lee, eds. I Am Because We Are: Readings in Black Philosophy. Amherst: U of Massachusetts P, 1995.
Hountondji, Paulin J. African Philosophy: Myth and Reality. Trans. Henri Evans and Jonathan Rée. 2nd ed. Bloomington: Indiana UP, 1996. (Note: African Philosophy was first published in French in 1976 by François Maspero, Paris. Its first English edition came out in 1983. Citations refer to the second English edition).
Hountondji, Paulin J. “Introduction: Recentring Africa.” Endogenous Knowledge: Research Trails. Ed. Paulin Hountondji. Dakar, Senegal: CODESRIA, 1997. 1-39.
Hountondji, Paulin J. “La philosophie et ses revolutions.” Cahiers philosophiques africains/African Philosophical Journal 3-4 (1973): 27-40.
Hountondji, Paulin J. The Struggle for Meaning: Reflections on Philosophy, Culture, and Democracy in Africa. Athens, Ohio: Ohio UP, 2002. Research in International Studies, Africa Series 78.
Imbo, Samuel Oluoch. An Introduction to African Philosophy. Lanham, MD: Rowman and Littlefield, 1998.
Janz, Bruce B. Philosophy in an African Place. Lanham, MD: Lexington, 2009.
Kalumba, Kibujjo. “Sage Philosophy: Its Methodology, Results, Significance, and Future.” A Companion to African Philosophy. Ed. Kwasi Wiredu. Malden, Mass.: Blackwell, 2004. 274–282.
Kathana, Ngungi. “Philosophic Sagacity in Africa.” MA Thesis U of Nairobi, 1992.
Kazeem, Fayemi Ademola. “H. Odera Oruka and the Question of Methodology in African Philosophy: A Critique.” Thought and Practice: A Journal of the Philosophical Association of Kenya ns 4.2 (December 2012): 185-204.
Kelbessa, Workineh. Indigenous and Modern Environmental Ethics: A Study of the Indigenous Oromo Environmental Ethic and Modern Issues of Environment and Development. Washington, D.C.: Council for Research and Values in Philosophy, 2009.
Kelbessa, Workineh. “Logic in Ethiopian Philosophical and Sapiential Literature.” Sumner and Yohannes 109-116.
Kresse, Kai. “Can Wisdom be Taught?: Kant, Sage Philosophy, and Ethnographic Reflections from the Swahili Coast.” Teaching for Wisdom: Cross-Cultural Perspectives on Fostering Wisdom. Ed. Michel Ferrari and Georges Potworowski. Dordrecht: Springer, 2008. 187-202.
Kresse, Kai. Philosophising in Mombasa: Knowledge, Islam and Intellectual Practice on the Swahili Coast. Edinburgh: Edinburgh UP, 2007. International African Library 35.
Makinde, M. Akin. African Philosophy, Culture, and Traditional Medicine. Athens, Ohio: Ohio U Center for International Studies, 1988.
Makinde, M. Akin. “Philosophy in Africa.” Momoh Substance 88-125.
Makinde, M. Akin. “Robin Horton’s ‘Philosophy’: An Outline of Intellectual Error.” University of Ife, Nigeria. 20 June 1978. Reading.
Maryknoll Institute of African Studies. “Sage Philosophy: The Root of African Philosophy and Religion.” 2013-14 Course Catalog.
Masolo, D. A. African Philosophy in Search of an Identity. Bloomington: Indiana UP, 1994.
Masolo, D. A. “African Sage Philosophy,” Stanford Encyclopedia of Philosophy. Spring 2006 ed.
Masolo, D. A. “Conversations with Luo Sages.” An African Practice of Philosophy. Spec. issue of SAPINA: A Bulletin of the Society for African Philosophy in North America 10.2 (1997): 249–264.
Masolo, D. A. “Sage Philosophy,” New Dictionary of the History of Ideas. Farmington Hills, MI: Gale, 2005.
Masolo, D. A. Self and Community in a Changing World. Bloomington: Indiana UP, 2010.
Momoh, Campbell Shittu. “An African Conception of Being and the Traditional Problem of Freedom and Determinism.” Ph.D. dissertation, Indiana University, Bloomington, July 1979.
Momoh, Campbell Shittu. “African Philosophy: Does It Exist?” Diogenes 33.130 (April 1985): 73-104.
Momoh, Campbell Shittu. “Modern Theories in an African Philosophy.” The Nigerian Journal of Philosophy 1.2 (1981): 8-25.
Momoh, Campbell Shittu, ed. The Substance of African Philosophy. Auchi, Nigeria: African Philosophy Projects, 1989. See esp. “Issues in African Philosophy” by Momoh.
Munyakho, Dorothy. “No Easy Road to Family Planning.” Daily Nation (Nairobi), 3 April 1990: 21.
Nation Reporter. “Scholar Dies in Accident.” Daily Nation (Nairobi), 15 December 1995: 40.
Ndaba, W. J. “Odera Oruka’s Sage Philosophy: Individualistic vs. Communal Philosophy.” Beyond the Question of African Philosophy: A Selection of Papers Presented at the International Colloquia, UNISA, 1994–1996. Ed. A. P. J. Roux and P. H. Coetzee. Pretoria: U of South Africa P, 1996.
Nordenstam, Tore. Sudanese Ethics. Uppsala, Sweden: Scandinavian Institute of African Studies, 1968.
Nyarwath, Oriare. “Odera Oruka’s Philosophy: A Search for his Philosophical Commitment.” PhD thesis U of Nairobi, 2009.
Nyarwath, Oriare. “Philosophy and Rationality in Taboos, with Special Reference to the Kenyan Luo Culture.” MA thesis U of Nairobi, 1994.
Ochieng, Omedi. “The Epistemology of African Philosophy: Sagacious Knowledge and the Call for a Critical and Contextual Epistemology.” International Philosophical Quarterly 48.3 (September 2008): 337-59.
Ochieng, Omedi. “The Ideology of African Philosophy: The Silences and Possibilities of African Rhetorical Knowledge.” Silence and Listening as Rhetorical Arts. Ed. Cheryl Glenn and Krista Ratcliffe. Carbondale: Southern Illinois University Press, 2011. 147-162.
Ochieng’-Odhiambo, Fredrick. “The Evolution of Sagacity: the Three Stages of Odera Oruka’s Philosophy.” Philosophia Africana 5.1 (March 2002): 19-32.
Ochieng’-Odhiambo, Fredrick. “Philosophic Sagacity: Aims and Functions.” Caribbean Journal of Philosophy 1.1 (2009).
Ochieng’-Odhiambo, Fredrick. “Philosophic Sagacity Revisited.” Graness and Kresse 171-79.
Ochieng’-Odhiambo, Fredrick. “The Significance of Philosophic Sagacity in African Philosophy.” PhD thesis U of Nairobi, 1994.
Ochieng’-Odhiambo, Fredrick. “Some Basic Issues about Philosophic Sagacity: Twenty Years Later.” Sumner and Yohannes 64-75.
Ochieng’-Odhiambo, Fredrick. Trends and Issues in African Philosophy. New York: Peter Lang, 2010.
Ochieng’-Odhiambo, Fredrick. “The Tripartite in Sagacity,” Philosophia Africana 9.1 (March 2006): 17-34.
Oruka, Henry Odera. “Ethics, Beliefs, and Attitudes Affecting Family Planning in Kenya Today.” Report. Inst. of Population Studies, U of Nairobi, 1989.
Oruka, Henry Odera. “The Fundamental Principles in the Question of ‘African Philosophy’ I.” Second Order 4.1 (January 1975): 44-55.
Oruka, Henry Odera. “Grundlegende Fragen der Afrikanischen Sage-Philosophy.” Vier Fragen Zur Philosophie in Afrika, Asien und Lateinamerika. Ed. Franz M. Wimmer. Vienna: Passagen, 1988. 35ff.
Oruka, Henry Odera. “Paul Mbuya the Sage Philosopher.” Sunday Nation 17 May 1981: 17.
Oruka, Henry Odera. “The Philosophical Roots of Culture in Western Kenya.” Grant proposal, in possession of the author. [1979?] (For corroboration of the date, see Ochieng’-Odhiambo 2002, 29).
Oruka, Henry Odera. “Philosophy and the Search for a National Culture.” Sunday Nation 31 August 1980: 30.
Oruka, Henry Odera. “Philosophy in English Speaking Africa.” Nuova Secondaria (Rome) 10 (1984): n. pag.
Oruka, Henry Odera. Practical Philosophy: In Search of an Ethical Minimum. Nairobi: East African Educational, 1997.
Oruka, Henry Odera. Oginga Odinga: His Philosophy and Beliefs. Nairobi: Initiatives Publishers, 1992.
Oruka, Henry Odera. Rev. of Sudanese Ethics, by Tore Nordenstam. East Africa Journal June 1971: 37-38.
Oruka, Henry Odera. “Sagacity in African Philosophy.” International Philosophical Quarterly 23.4 (December 1983): 383–93.
Oruka, Henry Odera, ed. Sage Philosophy: Indigenous Thinkers and Modern Debate on African Philosophy. Leiden: E. J. Brill, 1990. Nairobi: ACTS Press, 1991.
Oruka, Henry Odera. Trends in Contemporary African Philosophy. Nairobi: Shirikon, 1990.
Ogola Onyango, Peter. “A Continuing Study of Sage Philosophy: Emphasis on Jaramogi Oginga Odinga.” Presbey et al. Thought and Practice. 239-243.
Ogot, Bethwell A. “The Construction of a National Culture.” Decolonization and Independence in Kenya, 1940-1993. Ed. B. A. Ogot and W. R. Ochieng’. Athens: Ohio UP, 1995. 214-236.
Oseghare, Anthony. “Sagacious Reasoning in African Philosophy.” PhD thesis U of Nairobi, 1985. Abstract. U of Nairobi Digital Repository.
Oseghare, Anthony. “Sagacity in African Philosophy.” International Philosophical Quarterly 32.1 (March 1992): 95-104.
“Philosophic.” Collins English Dictionary. 10th ed. Dictionary.com.
Presbey, Gail. “African Sage Philosophy and Socrates: Midwifery and Method.” International Philosophical Quarterly 42.2 (June 2002): 177–192.
Presbey, Gail. “Attempts to Create an Inter-ethnic and Inter-generational ‘National Culture’ in Kenya.” Diogenes 11 July 2013: n. pag.
Presbey, Gail. “Conflict Resolution: Insights of Refugees at Dadaab Refugee Camp, Kenya.” Acorn: Journal of the Gandhi-King Society 12.1 (Spring-Summer 2003): 25-37.
Presbey, Gail. “Contemporary African Sages and Queen Mothers: Their Leadership Roles in Conflict Resolution.” Peacemaking: Lessons from the Past, Visions for the Future. Ed. Judith Presler and Sally Scholz. Amsterdam: Rodopi, 2000. 231–245.
Presbey, Gail. “Kenyan Sages on Equality of Sexes.” Odera Oruka Seventeen Years On. Spec. issue of Thought and Practice: A Journal of the Philosophical Association of Kenya (PAK) n.s. 4.2 (December 2012): 111-145.
Presbey, Gail. “Odera Oruka’s Position on Equality of Women: the Kenyan Context,” Henry Odera Oruka Symposium. University of Nairobi/ Goethe Institute. Nairobi, 19-21 November 2013. Address.
Presbey, Gail. “Philosophic Sages in Kenya Debate Ethnicity’s Role in Politics.” Ethnicity in an Age of Globalization. Ed. D. Carabine and L. L. Ssemusu. Nkozi: Uganda Martyrs UP, 2002. 161–183.
Presbey, Gail. “Sage Philosophy and Critical Thinking: Creatively Coping with Negative Emotions.” International Journal of Philosophical Practice 2.1 (Spring 2004): 1–20.
Presbey, Gail. “Sage Philosophy: Criteria that Distinguish it from Ethnophilosophy and Make It a Unique Approach within African Philosophy.” Philosophia Africana 10.2 (August 2007): 127-160.
Presbey, Gail. “Who Counts as a Sage? Problems in the Further Implementation of Sage Philosophy,” Quest: Philosophical Discussions, 11.1 and 2 (June and December 1997): 53–65.
Presbey, Gail, Dan Smith, Pamela Abuya, and Oriare Nyarwath, eds. Thought and Practice in African Philosophy: Selected Papers from the Sixth Annual Conference of the International Society for African Philosophy and Studies (ISAPS). Nairobi: Konrad Adenauer-Stiftung, 2002. Occasional Papers: East Africa 5.
Radin, Paul. Primitive Man as Philosopher. New York: D. Appleton, 1927. 2nd rev. ed. New York: Dover, 1957.
Sumner, Claude. Proverbs, Collection and Analysis. Vol. 1 of Oromo Wisdom Literature. Addis Ababa: Gudina Tumsa Foundation, 1995.
Sumner, Claude, and Samuel Wolde Yohannes, eds. Perspectives in African Philosophy: An Anthology on “Problematics of an African Philosophy: Twenty Years Later,” Ethiopia: Addis Ababa UP, 2002.
Towett, Taaita. “Le Role d’un philosophie Africain.” Spec. issue of Presence Africaine 27-28 (August-November 1959): 108-128.
Van Hook, Jay M. “Kenyan Sage Philosophy: A Review and Critique.” Philosophical Forum 27.1 (Fall 1995): 54-65.
Wanjohi, Gerald. The Wisdom and Philosophy of African Proverbs: The Gikuyu World-View. Nairobi: Paulines, 1997.
Wiredu, J. E. (Kwasi). Cultural Universals and Particulars: An African Perspective. Bloomington: Indiana UP, 1996.
Wiredu, J. E. (Kwasi). “On an African Orientation in Philosophy.” Second Order: An African Journal of Philosophy 1.2 (July 1972): 3-13.
Wiredu, J. E. (Kwasi). Philosophy and an African Culture. Cambridge: Cambridge UP, 1980.
Wiredu, J. E. (Kwasi). “What is Philosophy?” Universitas (Accra) 3.2 (March 1974).

Author Information

Gail M. Presbey
Email: presbegm@udmercy.edu
University of Detroit Mercy
U. S. A.

Desert

Desert is a normative concept that is used in day-to-day life. Many believe that being treated as one deserves to be treated is a matter of justice, fairness, or rightness. Although desert claims come in a variety of forms, generally they are claims about some positive or negative treatment that someone or something ought to receive. One might claim that a hard-working employee deserves a raise, an exceptional student deserves an academic scholarship, a dishonest politician deserves to lose an election, or a thief deserves to be imprisoned. But while such appeals to desert are common, there are a number of unsettled issues regarding the concept of desert itself and its relevance to justice. For example, it is common for people to claim that things other than humans, such as nonhuman animals or inanimate objects, can be deserving. How should we assess such claims? Some argue that desert presupposes responsibility. But must this be the case? According to some theories, desert is an important component of justice. Yet according to other theories, it has little or no role in justice. Some even question whether desert itself is a defensible concept. This article is designed to capture the scholarly agreement about these and other issues regarding desert. Where there is not such agreement, overviews of some of the competing accounts are presented.

1. The Structure of Desert
2. Desert and Some Related Concepts
   a. Merit
   b. Entitlement
3. The Role of Desert in Justice
   a. Desert in Distributive and Retributive Justice
   b. Desert, Institutions, and Justice
4. Meritocracy
5. Some Arguments against Desert
6. Concluding Remarks
7. References and Further Reading

1. The Structure of Desert

It is widely held that desert is a relation among three elements: a subject, a mode of treatment or state of affairs deserved by the subject, and some fact or facts about the subject, often referred to as the desert base or desert bases (McLeod 1999a, 61-62; Pojman 2006, 21; Sher 1987, 7). This relation is shown in the formula:

S deserves M in virtue of B,

where S is the subject, M is the mode of treatment, and B is the desert base or bases. Each of these elements will be examined in greater detail.

a. Deserving Subjects

One’s view about who or what are the appropriate subjects of desert is going to be influenced by one’s view about what desert requires on the part of a subject. If one thinks that merely having a quality or feature is sufficient to establish desert, then one will place few restrictions on the kinds of things that can be deserving. If one thinks that having some baseline self-awareness is sufficient to make one the appropriate subject of desert, then nonhuman animals such as bottlenose dolphins and chimpanzees can be appropriate bearers of desert. If one thinks that desert requires a certain level of responsibility, then one will advocate for a conception that places stricter limits on who or what qualify as deserving subjects. While there is some disagreement in the literature, most who theorize about desert view human beings, or at least some subset of human beings, as appropriate subjects of desert. A very broad conception of desert might seek to extend the concept to apply to certain or all sentient creatures, living things in general, or even inanimate objects. In fact, common language usage seems to support such a broad understanding. One might claim that Gone with the Wind deserves its reputation as one of the greatest movies ever made or that K2 deserves its reputation as one of the most difficult mountains to climb. But such a broad understanding of desert might involve problematic conflations of desert with other concepts. For example, while one might think Gone with the Wind’s lofty reputation is appropriate, one might argue that, strictly speaking, its reputation is not deserved. Instead, one might argue that in the cases of movies, mountains, and the like, the proposed desert claims are best understood as nothing more than general claims about how something should be judged or about what something should have or receive.
So, in an effort to maintain conceptual clarity, it might be best to attribute some common uses of the term ‘desert’ to inexact language usage. A survey of the literature suggests some support for both broader (Schmidtz 2002, 777) and narrower uses of the term (Miller 1999, 137-138).

b. Deserved Modes of Treatment

Subjects are said to deserve a wide variety of things. The modes of treatment or states of affairs that one can deserve can be classified as positive or negative outcomes, harms or benefits, or gains or losses (Kristjánsson 2003, 41). Positive modes of treatment include such things as awards, compensation, good luck, jobs, praise, prizes, remuneration, rewards, and success. Negative modes of treatment include such things as bad luck, blame, censure, failure, fines, and punishment. Oftentimes, a deserved mode of treatment will incorporate a source or supplier of that treatment. For example, one might argue that an athlete deserves praise from his manager. But such a source need not be specified in all cases since legitimate desert claims need not be directed toward any source. This is, in part, because legitimate desert claims need not be enforceable or even prescribe any action. Consider the claim that certain hardworking people deserve good fortune. While this is a legitimate desert claim, it need not be directed toward any source and it need not result in a call for any corrective action in cases in which particular hardworking people have not had good fortune (Kekes 1997, 124).

c. Desert Bases

There are a variety of ways in which desert bases can be categorized. Two categories that are commonly used in the philosophical literature are desert based on effort and desert based on performance. Some accounts of desert focus primarily on one’s effort toward achieving some goal. Usually the goal has to be viewed as worthwhile, since quixotic effort is rarely considered to be a basis for desert. Some argue that desert is not based solely, or even primarily, on effort, but also on one’s performance in a given context. The performance can be any number of activities that give rise to positive or negative evaluation, such as the winning of a race or performing poorly in a music competition. In some contexts, the performance can be assessed in terms of the contribution that one makes as a part of some group, such as a family, company, community, or even a society as a whole. Depending on the context, this contribution can be measured in terms of productivity, success, or some other similar measure. Michael Boylan presents a thought experiment that raises questions concerning how one’s effort and performance often are, and how they should be, weighed as factors in determining one’s desert. We are presented with two puzzle makers. The first puzzle maker is presented with a puzzle that is 80 percent complete, and he finishes the puzzle by completing the remaining 20 percent. The second puzzle maker is presented with a puzzle that is totally incomplete. He manages to complete 80 percent of the puzzle, and therefore does not finish it (2004, 139 ff). Boylan notes that, according to a common interpretation, the first puzzle maker would be the one who deserves the credit, and the resultant spoils, for completing the puzzle. But why should this puzzle maker get more credit when he completed significantly less of the puzzle?
He cannot claim credit for, and therefore cannot claim to deserve, receiving the puzzle in a more advanced stage of completion, since he did nothing to bring the puzzle to that stage of completion. The puzzle maker example highlights important issues regarding the nature and use of desert. First, there is the question of what basis or bases one should use to determine desert. Should effort, performance, or some combination of the two be used? Are there other criteria that ought to be used? Second, even if one determines that effort and performance are the relevant desert bases, then one must still determine how to correctly weigh the two in a given situation.

i. Desert and Responsibility

As noted above, one’s view about who or what can qualify as a deserving subject will be influenced by one’s view of the role of responsibility in establishing desert. Some have argued that at least some type of responsibility is a necessary condition for all desert (Smilansky 1996a, 1996b), whereas others have argued that, in at least some cases, one can deserve some mode of treatment without anyone being responsible for the desert base that gives rise to that mode of treatment (Feldman 1995, 1996). An example of desert without responsibility on the part of the deserving subject could be a case in which a victim of theft is said to deserve compensation even though he was not responsible for having his money stolen. In such a case, however, there is still someone, namely the thief, who is responsible for the desert base. Others might offer desert claims based on suffering that people endure at the hands of beings with dubious levels of responsibility, such as children, mentally handicapped or emotionally disturbed adults, and nonhuman animals. Some argue that there can be desert in cases in which the suffering is not caused by any being, such as when people suffer as the result of a natural phenomenon. One who supports this view might argue that a tornado victim can deserve financial support as a result of his suffering through that natural disaster. So, one can argue that while certain cases of desert require responsibility, not all do. In at least some cases, one can attempt to maintain a connection between desert and responsibility by appealing to a notion of negative responsibility. That is, one can argue that if someone suffers a misfortune for which she is not responsible, and this misfortune causes her to fall below some baseline condition, then she can deserve some treatment as a result of her suffering (Smilansky 1996a, 1996b). Alternatively, one could argue that cases like those of the crime and tornado victims are not cases of genuine desert.
One might argue that in situations in which a person suffers through no fault of her own she might be due compensation, and while it is a matter of justice whether she receives compensation, strictly speaking she does not deserve compensation.

ii. Desert and Time

Most desert theorists argue that desert is strictly a backward-looking concept. According to this standard view, a person’s desert is based strictly on past and present facts about him (Rachels 1997, 176; Feinberg 1970, 72; Miller 1976, 93). The view that desert must be backward looking has been challenged, however. According to these alternative, forward-looking accounts, certain legitimate desert claims can be based on future performances (Feldman 1995, Schmidtz 2002). This forward-looking view has been questioned based in part on a concern that it relies on instances of desert without legitimately grounded desert bases. The argument is that in order for a person to deserve something at a given time there must be some relevant fact about the person at that time that gives rise to his desert. The concern is that a desert base with sufficient grounding conditions that lie in the future cannot be such a fact, for it is metaphysically dubious (Celello 2009, 156).

2. Desert and Some Related Concepts

Desert is one of many concepts that are used to assess the appropriateness of what one does or should have. Prior to discussing the role of desert in justice, it is worthwhile to consider a couple of these other concepts.

a. Merit

There is no consensus on how to understand the relationship between desert and merit. Some argue that the terms ‘desert’ and ‘merit’ do not identify separate concepts. And, in ordinary language, the two are often used interchangeably (McLeod 1999a, 67). But many scholars have offered important distinctions between the two concepts. One way to distinguish between the two is to claim that merit should be understood more broadly than desert, since merit results from any quality or feature of a subject that serves as a basis for the positive or negative treatment of that subject even if that treatment is not strictly speaking deserved. On this account, desert is a species of the genus merit (Pojman 1997, 22-23). Although scholars discuss other distinguishing factors, e.g. effort and intention, a main factor used to distinguish desert from merit is responsibility. David Miller claims that a distinction between desert and merit is supported by the ways in which the two are discussed in contemporary discourse (1999, 125). He notes that ‘merit’ is used to refer to a person’s admirable qualities whereas ‘desert’ is used in cases in which someone is responsible for a particular result. One who supports such a distinction might claim that a person can merit treatment based on factors over which he has little or no control, based on characteristics that he did little to develop, and based on performances that required very little effort. For example, a man can merit, but not deserve, admiration for his native good looks. In addition, since merit does not require responsibility, it can apply to a wide variety of things, including nonhuman animals and even inanimate objects.

b. Entitlement

Understood in one way, entitlement claims are specific to particular associations, organizations, or institutions. Entitlement results from a subject having a claim or right to some treatment as a result of following the rules or meeting some explicit criterion or criteria of an association, organization, or institution. Although certain entitlements might be related to or give rise to desert (McLeod 1999b, 192), it is important to keep the two concepts distinct. There are many situations in which one deserves some treatment without being entitled to that treatment or in which one is entitled to something that one does not also deserve. Consider an automobile race in which debris on the track causes the leading driver to crash just prior to crossing the finish line. In such races, crossing the finish line first is the criterion used to establish the winner. If the crash prevented the driver from winning, one could reasonably argue that, although the driver is not entitled to win, he deserved to win because he had made the requisite effort, performed better than all of the other drivers for the entire race leading up to the crash, and was clearly going to win before he crashed. In addition to the fact that one can deserve something that one is not entitled to, one can be entitled to something that one does not deserve. Based on the laws of his country, an evil dictator could be entitled to a subject’s property that the dictator seized on a whim, but this does not mean that the dictator deserves the property. To use another common example, a son might be entitled to an inheritance left to him by his father, but he might not have done anything to deserve that inheritance.

3. The Role of Desert in Justice

In a general sense, justice can be understood to consist in persons getting what is appropriate or fitting for them. This idea of justice can be traced back to ancient times. Plato discussed justice in general, and distributive justice in particular, as involving a type of appropriateness or fittingness of treatment (Republic 1.332bc). According to some translations of Laws, Plato suggested that justice involves treating people as they deserve to be treated (6.757cd). Although there are many important differences between their theories, Aristotle joined Plato by arguing that justice involves a type of equality. In Nicomachean Ethics, Aristotle maintained that distributive justice involves judging people according to certain criteria in order to determine whether they are equal or unequal. He argued that, in distributions, it is just for equals to receive equal shares, unjust for equals to receive unequal shares, and unjust for those who are unequal to receive equal shares. He maintained that what each person receives should be geometrically proportional to the degree or extent to which his or her actions fit or match these criteria (5.3.1131a10-b16). People are judged based on normative concepts such as desert, merit, and entitlement to determine whether they are equal or unequal. Consider a distributive context in which two people are to be treated based on what each deserves. According to the idea of geometrical proportionality, if one person is twice as deserving as the other, then she ought to receive twice the share of what is to be distributed. According to the classical tradition, desert is one of the conceptual components of justice. But it is not understood as being the only conceptual component of justice. The Greek word axia, a word used by both Plato and Aristotle in their discussions of the distribution of things such as goods, honors, and services, can be translated as, or understood to include, “desert”. 
But, in certain contexts, it might be misleading to translate axia as ‘desert’ instead of translating it as ‘merit’ or some other related concept (Miller 1999, 125-126). Desert has a prominent role in certain more recent conceptions of justice, such as those of John Stuart Mill and Henry Sidgwick. In Utilitarianism, Mill claimed that it is considered just when a person gets whatever good or evil he deserves and unjust when he receives a good or suffers an evil that he does not deserve (2001, 45). Sidgwick argued that justice involved one’s desert being requited (1907, 280 ff). According to some contemporary theories of justice, often referred to as “pluralist” theories, desert is one among other important conceptual components of justice. These other components can include, but need not be limited to, entitlement, equality, merit, need, reciprocity, and moral worth. According to these theories, whether and to what extent desert is relevant to justice depends on the context in which the judgment is being made. And, when desert conflicts with the other components of justice, it must be measured against them in order to determine what justice requires (Miller 1999, 133; Schmidtz 2006, 4).

a. Desert in Distributive and Retributive Justice

Some scholars argue that desert’s role in distributive justice and retributive justice is symmetrical, i.e., that desert is more or less equally relevant in both (Sher 1987; Pojman 2006, 126). There is disagreement in the literature as to whether desert’s role ought to be understood in this way (Moriarty 2003; Smilansky 2006). Those who argue in favor of an asymmetry in desert’s role may attempt to explain the asymmetry in different ways. Some might argue that desert is relevant in retributive justice but not in distributive justice because being the appropriate recipient of a harm requires a level of responsibility that being the appropriate recipient of a benefit does not. Or, some might argue in favor of the asymmetry based on the differing modes of treatment that are called for in distributive and retributive contexts. The motivating idea used to support this view is that desert is an appropriate and important basis for punishment, but other concepts, e.g. equality and need, are the appropriate bases for distributions of goods and services. Even if one recognizes desert as an important conceptual component of both distributive and retributive justice, one might argue that desert differs in these different spheres. For example, one might argue that desert in distributive justice can be forward looking, while desert in retributive justice cannot (Feldman 1995, 74-76; Schmidtz 2002, 783-784).

b. Desert, Institutions, and Justice

In many cases, what one is said to deserve is connected to a certain convention or practice within an association, organization, or larger social institution. One cannot deserve first place in an automobile race if there are not any such competitions, nor can an employee at a steel mill deserve a raise absent the existence of the steel mill and the economic system of which the steel mill is some very small part. In the light of such examples, some scholars claim that, if it is a defensible concept at all, desert cannot exist in the absence of such institutional conventions or practices (Cummiskey 1987). This idea leads some scholars to offer what they view as an important distinction between pre-institutional desert (p-desert) and institutional desert (i-desert). Those who recognize p-desert argue that although specific desert bases or deserved modes of treatment are often defined within a particular associational, organizational, or institutional context, desert is a concept that is logically prior to and independent of both tacit and explicit institutional criteria and rules. They argue that the conflation of p-desert with i-desert is based on a failure to recognize the distinction between desert as a general normative concept and a particular type of desert that is influenced by institutions. According to this view, the distinction between p-desert and i-desert is based on an important difference between one deserving something regardless of whether one is a part of an institution and deserving a specific thing based mostly or wholly on institutional criteria or rules. The reason why someone deserves a specific trophy made of a specific material for his effort and performance toward winning a particular automobile race is because there is an institution that holds and regulates such an event. But the underlying reason why the person deserves something for winning the automobile race is that, pre-institutionally, effort and performance give rise to desert. 
Some argue that rejecting p-desert is problematic since, without it, there is no independent normative concept of desert. That is, there is no concept of desert that is external to any given institution which can be used to evaluate the justice of institutions. Another difficulty with the rejection of p-desert is that it would disallow the seemingly reasonable claim that a person can deserve something even if she is not a part of any identifiable institution. One could argue that a person could deserve something in a state of nature or that she could deserve something even if she were the last person on Earth. If she were to work hard to build a shelter and grow crops, for example, one could argue that she thereby deserves the benefits that resulted from those activities. Some who argue that John Rawls’s theory of justice as fairness allows for desert in distributive contexts interpret his theory as advancing a purely institutional conception of desert. Samuel Scheffler (2000) argues that Rawls rejects prejusticial desert and not pre-institutional desert, however. According to Scheffler, Rawls rejects prejusticial desert because Rawls thinks that desert can exist only after the principles of justice have been established. Scheffler interprets Rawls as arguing that a person deserves whatever it is that justice dictates he should receive and only what justice dictates he should receive. On this view, desert is not prejusticial since desert is defined in terms of justice as opposed to justice being defined, at least in part, in terms of desert. But justice is understood as being pre-institutional since justice is a normative concept, external to any particular institution, which can be used to judge institutions. The rejection of prejusticial desert will be viewed as problematic by those who, following more traditional conceptions of justice, define justice, at least in part, in terms of desert. 
The concern is that defining desert in terms of justice, instead of defining justice in terms of desert, results in a backward understanding of the relationship between the two concepts.

4. Meritocracy

In general, a meritocracy is a social system in which advancement, reward, and status are based on individual abilities and talents. In theory, those who are more able and talented would advance further, reap greater rewards, and achieve loftier status. Meritocracy can involve attempting to erect a basic structure of society according to the ideas of a meritocracy or it can involve attempting to implement a system in which a society’s basic institutions are governed, at least in part, by principles of awarding jobs and specifying rewards for jobs on the basis of merit. Although the two issues are sometimes conflated, Norman Daniels notes that whether someone merits a job is separate from what rewards are attached to that job. So, while a person might merit a particular job of great importance, one should not assume that he merits higher wages or greater rewards than another person who merits a job of much less importance (Daniels 1978, 218-219). As discussed above, there is some scholarly disagreement about the relationship between merit and desert. For those who offer clear distinctions between the two, a social system in which advancement, reward, and status were based on desert would be different from one in which such benefits were based on merit. A system of merit would be based on persons’ abilities and talents, whereas a system based on desert would focus on persons’ efforts and performances for which they are responsible. As a result, although the creation of either would be difficult, the creation of a system based on desert, a “desertocracy” if you will, seems to be more problematic than one based on merit. This is because a desertocracy would seem to require more, and more specific, information about persons than would a meritocracy.

5. Some Arguments against Desert

While many consider desert to be an important conceptual component of justice, others have argued against this view. Some argue that the concept of desert itself is problematic. This is known as the metaphysical argument against desert. Others claim that, even if desert is a defensible concept, determining what people deserve or treating people according to what they deserve is not feasible. These ideas are defended in the epistemological and pragmatic arguments against desert. Some maintain that, regardless of the force of the metaphysical, epistemological, or pragmatic arguments, desert does not have a prominent role in distributive justice. Examples of this view can be found in right- and left-libertarian theories of justice.

a. Rawls’s Metaphysical Argument

Among the contemporary theories of justice in which desert does not have a prominent role, John Rawls’s is the most often discussed. Drawing from Herbert Spiegelberg’s (1944, 113) idea that the inequalities of birth are types of undeserved discrimination, Rawls (1971, 104) claims that desert does not apply to one’s place in the distribution of native endowments, one’s initial starting place in society, i.e. the familial and social circumstances into which one is born, or to the superior character that enables one to put forth the effort to develop one’s abilities. As is often the case with Rawls’s work, as evidenced by the discussion of pre-institutional and prejusticial desert above, there are many competing interpretations of his views on the relationship between desert and justice. Yet, regardless of which of these interpretations is correct, Rawls’s work suggests a metaphysical argument against desert. According to this metaphysical argument, since most of who we are and what we do is greatly influenced by undeserved native endowments and by the undeserved circumstances into which we are born, one cannot deserve anything, or, at best, one can deserve very little. According to a common interpretation, Rawls believes that desert should not have any role in distributive justice, since these undeserved factors have a major influence on all would-be desert bases (Sher 1987, 22 ff). Others contend that Rawls does allow for some limited amount of desert (Moriarty 2002, 136-137). Regardless of whether Rawls does allow for some limited amount of desert, if sound, the metaphysical argument against desert would either substantially or completely undermine the concept.

b. The Epistemological and Pragmatic Arguments

David Hume was an early critic of those theories of distributive justice in which merit was assigned a prominent role. Although, as discussed above, there are differences between the concepts of desert and merit, and although Hume’s use of the term ‘merit’ differs from more modern uses, the kinds of arguments that Hume offered against merit are often used against desert in contemporary discussions. Hume argued that since humans are both fallible in their knowledge of the factors that would establish others’ merit and prone to overestimating their own merit, distributive schemes based on merit could not result in determinate rules of conduct and would be utterly destructive to society (Hume 1983, 27). This thinking is captured in the epistemological and pragmatic arguments against desert. According to the epistemological argument, since we cannot know the specific details of the lives of every member in a community or society, we cannot accurately treat people according to their desert. Recall that effort and performance are commonly cited as appropriate desert bases. Even if one agrees that only effort and performance should be used to determine one’s desert, concerns about how such determinations could be made with any accuracy or consistency still remain. How could one know how much of a person’s performance was the result of effort as opposed to natural talent, brute luck, or any other number of complicating factors? The pragmatic argument against desert is that, regardless of whether we could gain the knowledge needed to treat people according to their desert accurately, attempting to do so would have overriding negative consequences. Such negative consequences could include expending large amounts of time and resources in an effort to make accurate desert judgments and, perhaps, losses of personal privacy as one delves into the details of others’ lives.
Both the epistemological and pragmatic arguments must be accounted for when attempting to explain how a true meritocracy could and should be arranged. Those who do not advocate meritocracies on a large scale might overcome the difficulties suggested by the epistemological and pragmatic arguments by maintaining that the use of desert should be limited to smaller, local contexts. According to this view, since it is easier to determine a person’s desert in contexts that are limited in size and scope, accurate desert judgments would be both possible and feasible in such contexts.

c. Libertarian Arguments

According to Libertarianism, each individual agent fully owns himself. As a full self-owner, the agent is entitled to use his various abilities to acquire property rights in the world. For the libertarian, the primary goal of justice is the protection of negative liberty. Based on a principle of non-interference, negative liberty is understood as the absence of constraints on an individual’s actions. Some mark a distinction between right-libertarianism and left-libertarianism. Perhaps the most well-known explication of right-libertarianism, which is often understood as the traditional version of libertarianism, is given by Robert Nozick in Anarchy, State, and Utopia. Nozick advances an entitlement theory of justice. On this view, a just distribution is one in which each person is entitled to the holdings that she possesses according to the principles of justice in acquisition, transfer, and rectification. Nozick describes his entitlement theory as “historical,” because it determines the justice of holdings on the basis of how those holdings came to be held, and “unpatterned,” because the justice of holdings is not determined on the basis of some additional normative criteria, such as merit, need, or effort (1974, 155 ff). Because meritocracies are patterned, Nozick would reject them. Right-libertarians would be concerned with liberty-restricting attempts at distributing or redistributing resources according to prevailing conceptions of merit or desert. Therefore, the concept of desert does not have a major role in their theories of justice. Libertarians need not reject the concept of desert entirely, however. And Nozick offers various arguments against Rawls’s rejection of desert (1974, 215 ff). For the right-libertarian, desert could be a concept for the individual to consider in his personal decision-making processes, but not one that the state should use to try to guide allocations or distributions of resources. 
As with right-libertarianism, left-libertarianism is based on the idea that each individual agent fully owns himself. But the left-libertarian view about the appropriation of natural resources differs greatly from the right-libertarian view. Left-libertarians believe in the egalitarian ownership of natural resources. Anyone who appropriates a natural resource would have to pay others for the value of that resource. Such a payment might then be placed into a social fund, from which distributions to other members of a society are made. The resources are divided according to egalitarian principles and not on the basis of merit or desert. The rejection of desert as a basis of distribution could be based on the metaphysical argument that, strictly speaking, people do not deserve anything. Or, a left-libertarian could recognize desert as a distributive concept, but one that is less important than equality. According to such a view, equality, and not desert, should be the primary basis of distribution within a society.

6. Concluding Remarks

Despite its use in daily life, desert is a concept that remains somewhat nebulous. Regardless of certain areas of disagreement, those who recognize desert as an important normative concept generally agree on a number of issues regarding the nature of desert. One point of general agreement is that desert consists of, at least, three main parts – a subject, a mode of treatment, and a desert base. In addition, scholars generally argue in favor of the view that desert is applicable to human beings, or at least some subset of them. Lastly, scholars generally agree that understanding the nature of desert is important to understanding the nature of justice.

7. References and Further Reading

Aristotle. Nicomachean Ethics. 2nd ed. Translated, with an Introduction, by Terence Irwin. Indianapolis: Hackett, 1999.
- An accessible translation that also includes detailed notes and a glossary.
Boylan, Michael. A Just Society. Lanham, MD: Rowman & Littlefield, 2004.
- Presents a worldview theory of ethics and social philosophy.
Celello, Peter. “Against Desert as a Forward-Looking Concept.” Journal of Applied Philosophy 26, no.2 (May 2009): 144-159.
- Argues that desert should be understood as a strictly backward-looking concept.
Cummiskey, David. “Desert and Entitlement: A Rawlsian Consequentialist Account.” Analysis, 47, no. 1 (Jan., 1987): 15-19.
- Advances an institution-dependent account of desert.
Daniels, Norman. “Merit and Meritocracy.” Philosophy and Public Affairs, 7, no. 3 (1978): 206-233.
- A discussion of meritocracy, and the meriting of both jobs and the rewards attached to those jobs.
Feinberg, Joel. Doing and Deserving: Essays in the Theory of Responsibility. Princeton: Princeton University Press, 1970.
- A collection of previously published essays, and previously unpublished lectures, focused on issues surrounding the harm and benefit of others.
Feldman, Fred. “Desert: Reconsideration of Some Received Wisdom.” Mind, New Series 104, no. 413 (January 1995): 63-77.
- Argues against the ideas that desert must be backward-looking and that desert requires responsibility.
Feldman, Fred. “Responsibility as a Condition for Desert.” Mind, New Series 105, no. 417 (January 1996): 165-68.
- A reply to Smilansky’s “The Connection between Responsibility and Desert: The Crucial Distinction,” in which Feldman argues that Smilansky’s solution to maintaining a connection between desert and responsibility fails.
Hume, David. An Enquiry Concerning the Principles of Morals. Edited by J. B. Schneewind. Indianapolis, IN: Hackett, 1983.
- A presentation of Hume’s moral philosophy in which he develops ideas from Book III of A Treatise of Human Nature.
Kekes, John. Against Liberalism. Ithaca, NY: Cornell University Press, 1997.
- A sustained criticism of political liberalism, which includes a defense of the view that justice should be understood to combine desert and consistency.
Kristjánsson, Kristján. “Justice, Desert, and Virtue Revisited.” Social Theory and Practice 29, no. 1 (January 2003): 39-63.
- Argues that the sole basis for desert is moral virtue.
McLeod, Owen. “Contemporary Interpretations of Desert: Introduction.” In Pojman and McLeod, eds., (1999a): 61-69.
- A brief essay about desert, its bases, and its relation to other concepts.
McLeod, Owen. “Desert and Institutions.” In Pojman and McLeod, eds., (1999b): 186-95.
- Argues that some desert is institutional and some is preinstitutional.
Mill, John Stuart. Utilitarianism. 2nd ed. Edited by George Sher. Indianapolis: Hackett, 2001.
- Mill’s highly influential explication of the normative ethical theory of utilitarianism.
Miller, David. Principles of Social Justice. Cambridge, MA: Harvard University Press, 1999.
- A theory of social justice that includes detailed treatments of the concept of desert and its role in justice.
Miller, David. Social Justice. Oxford: Oxford University Press, 1976.
- A work on social justice, including a chapter devoted to desert.
Moriarty, Jeffrey. “Against the Asymmetry of Desert.” Nous 37, no. 3 (2003): 518–536.
- Argues against the view that desert can have an important role in retributive justice, while not having an important role in distributive justice.
Moriarty, Jeffrey. “Desert and Distributive Justice in A Theory of Justice.” Journal of Social Philosophy 33, no. 1 (Spring 2002): 131-43.
- Argues that John Rawls recognizes pre-institutional desert and that Rawls’s failure to consider such desert in his theory of justice seems unjust.
Nozick, Robert. Anarchy, State, and Utopia. New York: Basic Books, 1974.
- An influential defense of libertarian principles.
Plato. Laws. Translated by Trevor J. Saunders. In Plato: Complete Works, edited by John Cooper. Indianapolis: Hackett, 1997.
Plato. Republic. Translated by G. M. A. Grube. Revised by C. D. C. Reeve. In Plato: Complete Works.
- The Complete Works contains recent translations of all of Plato’s works, dubia, and spuria.
Pojman, Louis. “Equality and Desert.” Philosophy, 72, no. 282 (Oct. 1997): 549-570.
- Argues that the underlying justification of punishment and reward is desert or merit.
Pojman, Louis. Justice. Upper Saddle River, NJ: Pearson, 2006.
- An accessible introduction to different theories of justice, which includes a chapter on justice as desert.
Pojman, Louis, and Owen McLeod, eds. What Do We Deserve?: A Reader on Justice and Desert. New York: Oxford University Press, 1999.
- Contains selections from many influential works on desert and its role in justice.
Rachels, James. “What People Deserve.” In Can Ethics Provide Answers?: And Other Essays in Moral Philosophy, 175-97. Lanham, MD: Rowman and Littlefield, 1997.
- A chapter on desert, which includes a discussion of the relationship between desert and responsibility and a discussion of desert’s temporal orientation.
Rawls, John. A Theory of Justice. Cambridge, MA: Harvard University Press, 1971.
- Rawls’s seminal work in which he advances a theory of justice as fairness.
Scheffler, Samuel. “Justice and Desert in Liberal Theory.” California Law Review 88 (May 2000): 965-90.
- Discusses Rawls’s view on the asymmetry between desert’s role in distributive and retributive justice, and argues that Rawls rejects prejusticial, but not pre-institutional desert.
Schmidtz, David. Elements of Justice. Cambridge: Cambridge University Press, 2006.
- Argues for a pluralist theory of justice based on principles of equality, desert, need, and reciprocity.
Schmidtz, David. “How to Deserve.” Political Theory 30, no. 6 (December 2002): 774-99.
- Includes a “promissory account” of desert, which has forward-looking aspects.
Sher, George. Desert. Princeton: Princeton University Press, 1987.
- A detailed examination of desert and its role in justice.
Sidgwick, Henry. The Methods of Ethics. 7th ed. London: Macmillan, 1907.
- His seminal work in which he discusses egoism, intuitional morality, and utilitarianism.
Smilansky, Saul. “The Connection between Responsibility and Desert: The Crucial Distinction.” Mind, New Series 105, no. 419 (July 1996a): 485-86.
- A reply to Feldman’s “Desert: Reconsideration of Some Received Wisdom,” in which Smilansky argues that there is a connection between desert and responsibility.
Smilansky, Saul. “Control, Desert, and the Difference between Distributive and Retributive Justice.” Philosophical Studies 131, no. 3 (2006): 511–524.
- Provides a defense of the asymmetry between desert’s role in distributive and retributive justice.
Smilansky, Saul. “Responsibility and Desert: Defending the Connection.” Mind, New Series 105, no. 417 (January 1996b): 157-63.
- A reply to Feldman in which Smilansky argues for a distinction between positive and negative responsibility conditions for desert.
Spiegelberg, Herbert. “A Defense of Human Equality.” Philosophical Review 53, no. 2 (1944): 101-24.
- Defends an ethical principle of human equality, and a view of justice based on that principle.

Author Information

Peter Celello
Email: celello.3@osu.edu
Ohio State University Newark
U. S. A.

American Wilderness Philosophy

Wilderness has been defined in diverse ways, but most famously in the Wilderness Act of 1964, which describes it “in contrast with those areas where man and his own works dominate the landscape … as an area where the earth and its community of life are untrammeled by man, where man himself is a visitor who does not remain.” The idea of wilderness has played a curious and crucial role in American culture generally, and especially in the rise of American environmentalism. Conquering wilderness was central to colonial and pioneer narratives of progress. Reverence and nostalgia for wilderness became tangled with American nationalism at the end of the 19th century, with the end of the frontier. The passage of the Wilderness Act was an historically important event in American environmental politics, which tied the fate of much of America’s public lands to disputes over the meaning of wilderness. Since then, critics both international and domestic, but mostly from within the environmental movement, have criticized the idea of wilderness. Their claim is not that preserving or protecting natural places is a bad idea; rather, they argue that thinking about nature in terms of wilderness obscures important issues and leads to bad decisions.

Etymology
Historical Attitudes
a. Sources of Antipathy
b. Sources of Appreciation
Wilderness Preservation: Major Figures
The Wilderness Act
Critical Scholarship
References and Further Reading

1. Etymology

The etymology, or history of a word, is sometimes offered as though the roots revealed the word’s correct, present meaning. This is a misunderstanding, as the meaning of a word changes over time and may end up far from its original use. However, an etymology may provide important clues into the biography of an idea and may have rhetorical significance when the meaning of a word is contested. Both of these are true of the etymology of wilderness. A rough summary of the roots of wilderness is a place essentially characterized by wild animals. The oldest and central root in this word is wild. It is present in Common Germanic, and is found in Old English as wilde, with surviving instances from c.725 as an adjective for plants and animals that were not tamed or domesticated and applied similarly to places by c.893. The Oxford English Dictionary gives its probable origin as the pre-Germanic ghweltijos, with a possible parallel in the root of the Latin and Greek words for wild beast.

An alternate and apparently mistaken origin of wild often given in the wilderness literature, repeated in Thoreau’s journals and given by Roderick Nash for instance, is that it is the past participle of will (Nash 2014). Wilderness is understood to be self-willed land, not subjected to the will of a domesticator or cultivator. The resonance of the idea is strong, but unfortunately the Old English willian, the root of will, has no clear connection to wilde. One upshot of rejecting this interpretation is that wild is first a word for plants and animals, later applied by analogy to people, and not vice versa as Nash reports.

The next piece in the etymology is the Common Germanic word for beast, found in Old English as deor. This was combined with wilde to form wilddeor, “wild animal,” with instances known from c.825. The “(d)er” that separates wilderness from wildness is the root of our modern word deer. In Old English, this was combined with the suffix –en, to make the adjective wilddeoren, which became wildern in Middle English, and was used to describe places. The –en suffix generally denotes what something is made of, as in “wooden” and “earthen,” so a wildern place is one made of wilddeor, of wild beasts. To this is joined the suffix –ness in an unusually concrete sense to form wilderness.

The centrality of wild animals in the etymology is important. Wilderness points not only to the absence of human culture in the landscape but to the presence of that which is often incompatible with it. When the wolves and the bears flourish, the domestic livestock are in danger, and people fear to walk at night. And wild beasts are easily displaced by human activity and presence. Aldo Leopold calls the crane “wildness incarnate” because of its love of solitude (1949). Nash draws out this connection to animals when he interprets the etymology as “the place of wild beasts” (1970). “If wildlife is removed,” he writes, “although everything else remains visibly the same, the intensity of the sense of wilderness is diminished” (Nash 1970). He cites Thoreau’s delight in the New England Lynx, Theodore Roosevelt’s equating wilderness with big game ranges and Leopold’s discussion of the last Grizzly on Escudilla. Leopold often treats particular species as defining the character of the places they dwell.

2. Historical Attitudes

A history of conflicted attitudes towards wild places and nonhuman nature goes much further back than the roots of the word wilderness. Many languages have no equivalent word to wilderness, but still they have managed sophisticated literature on the question. Both the beauty and the inhospitality of wild nature, and humanity’s ambiguous relationship to it, are common themes going back to the very oldest preserved literature.

In telling the history of attitudes toward wild nature, there are two opposite errors of oversimplification to avoid. On the one hand, some treat the modern American and romantic elevation of wilderness as something entirely new, contrasting with previous expressions of antipathy toward wild nature. Roderick Nash (2014) leans in this direction when he says wilderness began “as the unrecognized and unnamed environmental norm for most of Earth’s history, created as a concept by civilization, thereafter widely hated and feared, and quite recently and remarkably, appreciated.” On the other hand, one might find romantic sounding passages of wilderness appreciation in diverse ancient texts, whether the Epic of Gilgamesh, the Vedas or the Psalms, and conclude that there is nothing particularly new or interesting about the American idea. The more interesting historical questions are the more nuanced considerations concerning how and why wilderness is valued or shunned across times and cultures.

a. Sources of Antipathy

While there was no universal hatred or fear of wild nature in the ancient world, at least not to the exclusion of a great deal of appreciation, there was a remarkable degree of denigration of wild nature, reaching something of a climax in early modern Europe. Romanticism was in part a reaction against this, and the ideas that led to it, and modern wilderness appreciation and preservation took root in the soil of romanticism. The origins of that hostility are variously attributed to the Jewish and Christian scriptures, Greek and Roman philosophy, the scientific and industrial revolutions, or some combination of these.

Clear claims of anthropocentrism, of the relative worthlessness and proper subjugation of wild nature, are frequently found in ancient Greek and Roman philosophers. Here, rationality is established both as the substance of dignity and worth and as the dividing line between the human and the nonhuman (as well as marking the proper hierarchies between some humans and others). Plato, in the voice of Socrates, makes clear his limited estimation of the value of wild things in the Phaedrus (section 230d) when he writes, “I am devoted to learning; landscapes and trees have nothing to teach me—only the people in the city can do that.” Aristotle shows a much greater inclination to appreciate and study wild nature, but he makes clear its subjugation and secondary value: nature making nothing in vain means that it all must exist for the sake of man (Politics 1256b7-22). Chrysippus agrees, finding it absurd to think that the world could have been made for the plants, or the irrational animals (cited in Coates 1998). The Roman philosopher Lucretius describes the presence of forests, mountains and wild beasts on the earth as a serious defect, taking heart that “these regions it is generally in our power to shun” (cited in Nash 2014). This is not to say that there were no elements of appreciation for wild nature in Greek or Roman society or letters, for that is not the case. But there was a clearly articulated and enduring view which implied wild nature was essentially wasted space.

Many commentators, including Nash, have followed Lynn White’s lead in pointing to theism and the Jewish and Christian scriptures as the source of antipathy toward wild nature (White 1967). These scriptures had a formative influence on modern attitudes toward wilderness because of the prominent use of the word in English translations of the Bible. Spiritual connotations, especially from the Exodus account of the Israelites wandering in the wilderness for forty years, were laid onto the word, as well as new physical associations with arid and desert landscapes. The meaning of these spiritual connotations is complex, as wilderness is at once a place of divine revelation as well as temptation and punishment. The Bible does not clearly convey an overarching attitude of fear or hatred of the wild. Genesis 1 repeatedly declares the goodness of everything, prior to the creation of humans. The Psalms celebrate both the useless parts of nature, such as rock badgers, and the dangerous, such as lions, as independently glorifying to God (Psalm 104). Animals, both wild and domestic, plants and even soil are given protections in the Mosaic Law (for example, Exodus 23:10-11; Deuteronomy 20:19-20, 22:6, 25:4), and God is described as making covenant with the Earth and all its creatures (Genesis 9). Even the often-cited passage giving people dominion over the other animals does not clearly put them at human disposal, for it manifestly did not include permission to eat animals (Genesis 1:28-29; Genesis 9:3).

As Greco-Roman philosophy and Christian theology increasingly joined together in medieval and modern European intellectual culture, the ideas of Plato and Aristotle were given new expression in biblical and theological language. Aquinas, for instance, privileges rationality in this combined way, arguing that only the rational creatures can know and love God and thereby fulfill the purpose of creation (Summa Contra Gentiles c.1270). The Enlightenment and scientific revolution included a great revival of interest in Greek and Roman philosophy, and serious interest in nature was focused onto the search for universal, mathematical laws. Francis Bacon’s writings in the early 17^th century established a lasting connection between the idea of dominion in Genesis and the project of scientific-technological mastery over nature. The metaphor of nature as machine came to dominate. Descartes argued that, lacking rationality, non-human animals should not be supposed to have souls or consciousness at all, but are mere automata, to be freely experimented upon (Discourse on Method 1637). As the scientific project bore fruit in the industrial revolution, the dominant view of wild nature was as disordered material which could be brought into rational order through science and labor, and thus serve its ultimate purpose of existing for the benefit of mankind. This view is clearly expressed in John Locke’s influential labor-theory of property, which justifies the human worker’s property rights over nature on the basis of nature having little to no value before the worker’s labor was mixed with it (Second Treatise on Government 1689).

The Lockean attitude toward wilderness as waste is clearly evident among the early American colonists. For instance, the Puritan John Winthrop gave as a reason for going to America that it would be wrong to let a whole continent lie waste (Nash 2014). Justification for displacing indigenous people was often asserted on the basis that they had not worked the land, or at least not rationally. And the attitude continued to dominate well into the settlement of the west. Alexis de Tocqueville complained upon visiting America in the 1830s that Americans could only see their wilderness as an obstacle to progress (cited in Nash 2014). During the time of the exploration, colonization and settlement of North America by Europeans, the idea that the less rational parts of nature existed for the sake of the more rational was thoroughly entrenched. And wilderness especially had to be transformed by labor to fulfill that purpose.

b. Sources of Appreciation

The scientific revolution also produced a contrary attitude towards nonhuman nature, however, best expressed in a group known as the physico-theologians. Writers such as John Ray (1627-1705) found in wild nature, not the absence of rationality, but the rational design of God, worthy of study and contemplation. Indeed, studying wild nature was thought to be an especially important path to understanding God, since only wild nature was unaffected by the fall and sin of mankind. Physico-theology contributed to the rise and influence of natural history, an approach to science that in turn deeply informed the wilderness preservation movement.

The practice of natural history flourished in America in the 18^th and 19^th centuries and was characterized by the description, collection and classification of natural specimens and objects. The fondness of European aristocrats and intellectuals for natural curiosities from around the world made natural history a singular way for colonists to stay connected to the social and intellectual affairs of Europe. The travel and work of natural historians was thus often tangled with the broader European projects of exploration and conquest, and the naturalists, who frequently found themselves caring for what was being destroyed, often expressed significant concern about this connection. Natural historians were largely generalists, writing about nature as a comprehensive whole, and often organized in local, amateur, natural history societies (Smallwood 1967). Some, like Alexander von Humboldt, were well-connected members of European society who travelled over much of the world, while others, like John and William Bartram and John James Audubon, were from the colonies and travelled only regionally. Artistic and literary abilities were crucial for their success, and the travel narratives of naturalists became a popular literary genre, where some of the earliest and strongest positive evaluations of wild nature found their greatest audiences.

Romanticism, a multifaceted cultural trend and backlash against the scientific and industrial revolutions, brought not just an acceptance but an enthusiastic veneration of wild nature and wilderness to cultural prominence. Romanticism had strong connections to the natural history tradition: William Wordsworth and Samuel Coleridge were readers of William Bartram (Smallwood 1967), and Alexander von Humboldt was closely associated with Goethe. But romanticism’s influence on wilderness appreciation comprised much more than its further endorsement of natural history as a significant mode of science. Romanticism treated aesthetic responses to nature as just as important as nature’s quantifiable properties, and developed a robust conception of the sublime. Romantic trends in literature and painting, especially the Hudson River school, produced many powerful, positive portrayals of wilderness. Suspecting that modern industrial society corrupts people rather than cultivates them, romanticism also endorsed primitivism and the pursuit of frequent solitude in nature.

Another aspect of romanticism that was important for the rise of wilderness preservation was its emphasis on nationalism. America’s great wilderness became a point of pride and national identity, something that set it apart from Europe. The historian Frederick Jackson Turner argued that several aspects of the American character, from self-reliance to a democratic spirit, were products of the American frontier experience (1921). And he worried that the continuation of the American national distinctiveness was jeopardized by the end of the frontier, which was formally declared in the 1890 census. Frontier nostalgia drove a lot of early preservation work, as well as related phenomena, particularly the scouting movement and recreational hunting.

America also saw the development of a distinctive form of the romantic movement known as American transcendentalism. Ralph Waldo Emerson’s Nature, a seminal text for transcendentalism, explores the importance of solitude, the beauty of nature and the significance of both of these for understanding God. Emerson’s influence on Henry David Thoreau, and his long relationship with him, plants the roots of the American wilderness preservation movement firmly in transcendentalism. For Thoreau is the first major figure and intellectual of the wilderness tradition.

Another important factor in the growing appreciation of wilderness was America’s early experience with extensive deforestation. Among the many who bemoaned this loss, none articulated the problem for the public more clearly and effectively than George Perkins Marsh. His 1864 Man and Nature first clearly indicted deforestation for its effects on soil and water. Marsh refuted the naïve optimism of the day concerning the beneficial effects of all human labor on nature, and outlined instead the devastating, unintended harms caused by inappropriate uses of land. The economically practical case he provided for the conservation of forests and general care for the land provided an important complement to the aesthetic and spiritual emphasis of the romantics.

3. Wilderness Preservation: Major Figures

Expressions of wilderness appreciation multiplied quickly in the late 19^th and early 20^th century, and many people made distinctive contributions in art, literature, science and policy. A few major figures, however, laid out distinctive visions which guided the course of wilderness preservation, and which contemporary scholars tend to treat as the defining core of the tradition.

a. Henry David Thoreau

Thoreau’s work develops many of the romantic themes towards nature. Especially in Walden, he is concerned with the degrading influence of too much society, commerce and industry and with the salutary effects of nature’s company. He was a frequent canoe traveler and mountaineer, and developed a daily habit of extensive hiking. Both Walden and his travel writings argue for the existence of deeper meanings and higher uses in nature than as mere material for the human economy. He found the aesthetic value of nature to be spiritually and morally important, and woefully underappreciated. But he also spoke of a broader point of view, which sees the weeds as food for the birds and the squirrels as planters of the forest. Recognizing that nature, often in the very places it is widely despised, has hidden and indirect values, he anticipates the contemporary economic idea of ecosystem services.

After his stay at Walden Pond, Thoreau turned his energies increasingly to natural history, particularly in the mode of Humboldt. He expressed some concern that a purely scientific approach might disenchant nature and dull the imagination. But he was committed to cultivating the greatest possible awareness of nature and to fully appreciating the value of facts, refusing to reduce appearances to the merely symbolic as Emerson had tended to. He kept careful records of plant and animal distribution and phenology, which have proven valuable for current climate science, and made seminal contributions to the understanding of forest succession and seed distribution. Unfortunately, Thoreau’s early death left many of these projects unfinished and unpublished, although most are now available. His extensive journals, influential works in their own right, show a rich blending of this careful attention to natural history with poetic and philosophical insight.

The essay Walking, revised and reworked until the end of his life, is particularly significant for wilderness thought. In this essay he treats wildness as the highest ideal of ethics and aesthetics and defends the view that both land and people need a balance of the cultivated and the wild, albeit sharply tilted toward the wild. In this work appears his oft-quoted dictum that “In wildness is the preservation of the world.” Max Oelschlaeger points to Thoreau’s lament for pine trees reduced to mere lumber as the earliest and clearest statement of a preservationist’s credo: “Every creature is better alive than dead, men and moose and pine trees, and he who understands it aright will rather preserve its life than destroy it” (cited in Oelschlaeger 1991). Other late works, such as Huckleberries, progress from his early radical valuations of nature to clear preservationist policy arguments for parks, greenways and protected areas.

Thoreau was considered a minor figure at first and was later highly esteemed in American literature and political thought. His philosophical contributions, not only to environmental philosophy but also to epistemology, philosophy of science and ethics, received increasing attention in the early 21^st century.

b. John Muir

The Muir family emigrated from Scotland when Muir was a young boy, as his father sought the opportunity to live his Campbellite faith more authentically. Muir’s childhood was saturated with an evangelical Biblicism and the poetry of Robert Burns, the Scottish romantic. His experience as a frontier farmer was largely negative, as he was sorely abused by his father for hard labor. Thanks in part to his genius for mechanics and invention, he found his way to the University of Wisconsin in Madison where he found an enthusiasm for botany. He also encountered transcendentalism and a romantic, nature-centered spirituality, which at first supplemented and then gradually transformed his evangelical faith. There is substantial debate on if and when he might be considered a pantheist. What is clear is that Muir’s wilderness philosophy is often expressed in much more intensely religious language than Thoreau’s, and is frequently wrapped in biblical metaphor.

Frequently a solitary traveler in the wilderness himself, he often focused on the potential of wilderness and of nature study for personal and spiritual transformation. His prescription for overworked and materialistic America was a conversion, a baptism in mountain beauty and reconciliation to wild nature. Muir found nature to be not only sublime and beautiful but earnestly benevolent. Even what appears harsh and destructive in nature, such as glaciation (a process on which he became a significant expert), should be seen as part of the ongoing, loving, creative process. Like Thoreau, Muir found tame and domestic plants and animals to be generally degraded versions of their wild counterparts, and he sometimes spoke in terms of the rights of nonhuman nature.

Muir’s increasing political significance grew out of his personal involvement with Yosemite, and its gradual progress toward becoming a national park. He became convinced that federal ownership was the only way that such exceptional places could be preserved from destruction. While God had preserved California’s giant trees through the ages, he wrote, only Uncle Sam could protect them from fools (1901). His eloquent writing on behalf of national parks and preservation made him a figurehead for the movement, a role which was formalized with the formation of the Sierra Club with him as charter president.

Early in the 20^th century, the movement for conservation on public lands began to fracture. Muir came to represent one end of a spectrum on how much and what sort of economic uses should be present in the federal reserves. Muir’s emphasis on the spiritual and aesthetic values of wilderness clashed with the progressive, utilitarian vision of Gifford Pinchot, who was more concerned that the nation’s resources should be developed efficiently for the public good, protected from shortsighted exploitation for private enrichment. The proposed and eventual damming of Hetch Hetchy Valley, within Yosemite National Park, for municipal water and power, brought this tension to bitter conflict during Muir’s later years. Muir was not opposed to productive work in nature, nor the human transformation of it in many places. He spent many profitable years working in sawmills and later managing a vineyard. But beauty, he held, is as much a need as bread or water is, and our physical needs can be met without destroying our most beautiful scenery. Just as timber can be had without cutting the redwoods, water could be had without flooding a national park. Muir saw the problem as one of greed for profit unconstrained by higher sensibilities.

c. Aldo Leopold

Aldo Leopold made significant contributions to both wilderness philosophy and policy. An avid naturalist and outdoorsman, Leopold worked within the new Forest Service to enhance recreation and hunting opportunities. He developed and established the scientific practice of game management. He was constant in his advocacy of a thoughtful and informed stewardship of nature, but his early confidence in the possibility and value of scientific manipulation of the land for increased timber and game production was heavily tempered in his mature work.

Leopold’s major policy contribution was to push for a separate classification of land within the national forests, to be kept as roadless wilderness—a clear precursor to the Wilderness Act. Leopold, and those who followed his lead, such as Bob Marshall and the other founders of the Wilderness Society, were responding to the rise of the automobile, which Muir had not so much appreciated as a threat to wilderness. Touring and camping by automobile was growing rapidly, and the parks and forest recreation areas were filling with the roads and hotels to accommodate them. Leopold sought to protect some areas from this sort of development, first for those who wished to pursue more primitive types of recreation, including travel by canoe and pack train, and seekers of solitude, and then later for the protection of land and wildlife.

Philosophically, Leopold integrated wilderness appreciation with the maturing science of ecology, developed new arguments for preserving wilderness and articulated a moral vision for human relations to nonhuman nature, which he called the land ethic. From ecology, Leopold took a much more detailed picture of the land as an interdependent system of plants, animals, soils and natural processes—a biotic community. Understanding the land as a functionally integrated entity means that the land can be healthy or sick, analogously to an organism. Nutrients can be retained in cycles or lost; soils can be accumulated or depleted; species can persist or become extinct. Only healthy land has the capacity to replenish itself when disturbed. And since the workings of the land mechanism are beyond a full human understanding, an attitude of caution is warranted. Removing predators (the standard practice when he began his forestry career) could lead to disastrous consequences for soils and plants, a lesson he learned from personal experience.

Leopold developed the recreation argument for wilderness along several lines. Against charges of elitism, that big wilderness served the small minority with the strength and leisure time for it, he held that minority interests are worthy of protection. There is no danger of insufficient places for the more popular auto tourism, and public lands should not all be devoted to one kind of recreation. Camping and woodcraft are not only an idle nostalgia for our frontier past, they are a moral improvement upon it, directing old instincts to higher ends. He likened this change to the way football is an improvement over war; the transformation to sport preserved the best parts of the older practice without the downsides.

In later works, Leopold increasingly emphasized the value of wilderness for science. Wilderness is not the only healthy land, since some traditional agricultural landscapes have shown long-term resilience, but it provides crucial examples of biotic communities that have functioned well over long time spans. Ecologists need wilderness the way doctors need healthy bodies to study. His own restoration of a worn-out farm demonstrated the practical value of this kind of ecological knowledge. Wilderness is also an important refuge for preserving wildlife, especially the large predators generally eliminated in other places. The arguments from science and wildlife are not entirely separate from the recreation argument, as Leopold suggests that wildlife study is one of the greatest forms of outdoor recreation.

The land ethic grew out of Leopold’s conviction that only a change in our ethical attitude toward the land could prevent us from spoiling it. Such a change he thought was not only possible but underway. The care people naturally feel toward their community and their neighbor can be extended to the land, for ecology clearly shows that the land is a community to which we belong. The recognition that we are plain members and citizens of that community supports the restraint and forbearance that is necessary to live in harmony with the land. Preserving the “integrity, stability and beauty of the biotic community” should limit our use of the land, as surely as economic feasibility does.

Leopold’s land ethic has been heralded as the first ecocentric ethic, an approach finally adequate to our environmental problems. It has also been criticized as offering a fascist justification for overriding individual rights in the interest of the community (Tom Regan, cited in Callicott 1987). Its lineage has also been debated: whether it is based on Darwin’s use of Hume’s ethics (Callicott 1987), or if it has more in common with the pragmatism Leopold would have encountered at Yale (Norton 1988). Either way, Leopold’s respect for the biotic community and his vision of wilderness as an important use within federal lands profoundly shaped the future of environmental thought and the coming Wilderness Act.

4. The Wilderness Act

The National Wilderness Preservation System was created with the passage of the Wilderness Act in 1964. The Act did not create a separate agency, but designated and protected roadless areas within federal lands, whether managed by the Forest Service, National Park Service, Fish and Wildlife Service or the Bureau of Land Management. The Act provides for substantial public input on proposed listings and requires congressional action for land to be added to or removed from the system. Similar to national parks, wilderness areas are required to be managed under a twin mandate: they are to be kept for the “use and enjoyment” of the people, and their wilderness character is to be preserved unimpaired.

The Wilderness Act includes a poetic definition of wilderness, which has been the subject of much critical discussion:

A wilderness, in contrast with those areas where man and his own works dominate the landscape, is hereby recognized as an area where the earth and its community of life are untrammeled by man, where man himself is a visitor who does not remain. An area of wilderness is further defined to mean in this Act an area of undeveloped Federal land retaining its primeval character and influence, without permanent improvements or human habitation, which is protected and managed so as to preserve its natural conditions and which (1) generally appears to have been affected primarily by the forces of nature, with the imprint of man’s work substantially unnoticeable; (2) has outstanding opportunities for solitude or a primitive and unconfined type of recreation; (3) has at least five thousand acres of land or is of sufficient size as to make practicable its preservation and use in an unimpaired condition; and (4) may also contain ecological, geological, or other features of scientific, educational, scenic, or historical value.

Some of the definition’s notable features are the emphasis on the absence of human presence and impact, the language of degree and subjective appearance and the unusual word, “untrammeled.” Trammel is not a form of trample, and does not involve the idea of walking. It means to bind up, constrain or fetter, not simply touch or influence. Trammel can also be a noun, referring to a kind of fish net or to rope shackles tied on a horse’s legs to keep it from galloping.

Implementation of the Wilderness Act required some interpretive decisions. The Forest Service, generally seeking to maintain more flexible control over its lands, argued for a strict interpretation of wilderness, excluding any lands with a significant history of human impact. This came to be known as the purity policy. Others, including the Wilderness Society, the non-profit organization which had first pushed for the law and shepherded it through the years of debate before it finally passed, argued for a more flexible and pragmatic understanding of wilderness (Turner 2012). Rather than looking back at whether the land had suffered human impact, the question was whether it could be managed in a way that would render human impact substantially unnoticeable in the future (Woods 1998).

At stake in this question was both how big the wilderness system could be and whether there would be more than a few wilderness areas east of the Mississippi, where historic impacts were generally greater. The forward-looking approach championed by the Wilderness Society eventually triumphed with the 1975 designation of many eastern areas with significant past impacts, which has come to be called the Eastern Wilderness Act.

Another issue that came into the question of purity was how much wilderness should be protected from recreational overuse. Frontier nostalgia tended to a form of recreational woodcraft that was fairly high impact, with campers cutting boughs for beds and lean-tos, for instance. As outdoor recreation continued to increase in popularity through the 1960s and 70s, there was debate over whether wilderness and lands for recreation ought to be given separate designations, which would have resulted in far fewer wilderness areas. The dilemma was mitigated with a movement toward low-impact camping, culminating in the Leave No Trace program (Turner 2002). While vastly increasing the number of people who can camp in a wilderness area without spoiling it, the new methods have also introduced a greater dependence on consumer products and synthetic materials and reduced the need for knowledge of the natural history of the place.

Another test for the meaning of federal wilderness areas would come with the debates over public lands in Alaska, where vast roadless areas often contained indigenous peoples practicing subsistence lifestyles. In 1980, the Alaska National Interest Lands Conservation Act added 56 million acres to the National Wilderness Preservation System, more than doubling its size, but permitting many activities crucial to subsistence living not permitted in designated wilderness outside Alaska. Some motorized access and even log cabins, it was decided, do not pose the same threat to the “Earth and its community of life” in Alaska as they would in the more densely populated U.S. states.

5. Critical Scholarship

Wilderness preservation has often faced criticism and opposition in the political arena. The Sagebrush Rebellion was largely a reaction against the implementation of the Wilderness Act on western lands. Such conflict is often rooted in issues of public versus private property rights. The academic literature on wilderness has tended to focus on other issues—the history of the idea, its influence on policy, and whether it represents a reasonable or appropriate approach to nonhuman nature.

Roderick Nash’s 1967 book, Wilderness and the American Mind, was the seminal work for contemporary wilderness scholarship. It traced the history of the idea of wilderness from ancient attitudes toward nature through the passage of the Wilderness Act. Nash frames the story as the remarkable rise of appreciation for wilderness from the midst of long-standing antipathy. Though not without offering some criticism, the work is largely celebratory of the wilderness tradition and preservation movement and has had an enduring popularity with the backpackers and activists as well as a lasting influence on scholarship. Much of the wilderness scholarship subsequent to Nash’s work has essentially aimed to supplement or correct the general picture given in it.

The first in a series of criticisms and responses that came to be known as the great new wilderness debate came from Ramachandra Guha, an environmental and political historian from India (1989). Guha argued that the radical environmental movement in America had an unhealthy focus on biocentrism and wilderness, which are largely irrelevant to the problems he claims are at the root of the environmental crisis: overconsumption and militarization. Environmentalism in India has largely been a class struggle between the rural poor, who depend on the forests for their subsistence, and the over-consuming urban industrialists, who threaten to destroy the forests and poor alike. Western environmental organizations coming into India and working to establish wilderness-like reserves, such as the tiger reserves, are further displacing traditional subsistence economies to make playgrounds for the wealthy. Wilderness, according to Guha, was not appropriate in densely and long inhabited places like India.

William Cronon, an environmental historian, and J. Baird Callicott, an environmental philosopher, followed with arguments that there was something more deeply flawed about the idea of wilderness, even in North America (Cronon 1995; Callicott 1991). Unlike Guha, both insisted that they support protected areas; their problem was with a way of thinking. Wilderness, they argued, is historically false, denying the long and extensive human influences on the North American landscape, and thus continuing the denial of the humanity of Native Americans. Wilderness thinking presupposes a pre-Darwinian dichotomy between people and nature by treating only people-less places as real or pristine nature. The result of this dualism is misanthropy and a tendency to see the removal of people as the solution to every environmental problem. Holding wilderness to be the ideal form of nature, they argued, is an obstacle to a responsible environmentalism, which must help us live in harmony with nature in the places where we live and work, not just the places we visit and play in. Cronon in particular worried that caring for pristine nature far from home makes it easier to tolerate the abuse and destruction of mundane nature close to home. Wilderness thinking, they alleged, also tends to treat nature as static, seeking to preserve a place in a particular form, instead of recognizing the dynamic processes at play in nature.

More critics soon followed, drawing out the imperialism, colonialism or ethnocentrism latent in the preservation project. Many of the criticisms were clearly well grounded. Frontier nostalgia requires a certain blindness to the perspectives of Native Americans, and Western-style parks have been implemented in Africa in ways that are brutal to the indigenous inhabitants. But many wilderness advocates found the criticisms to be unfair overall and not helpful to achieving the responsible environmentalism the critics claimed to desire. The Wilderness Act had not endorsed an ideal of pristine or untouched nature, and the Forest Service’s attempt to interpret it that way had been roundly defeated (Friskics 2008). And the experience in Alaska had shown that wilderness preservation need not be hostile to indigenous people or traditional subsistence cultures. It is not that the environmental movement in America has only sought wilderness preservation and not worked for reform in forestry, agriculture and industry; it is just that reform efforts have often been less successful and harder to accomplish than wilderness designation (Foreman 1998).

Val Plumwood gives a thorough analysis of the issue of dualism in the wilderness tradition, finding it in the frequent appellation, “virgin,” and the legal doctrine of terra nullius in the Australian outback (1998). But she also demonstrates how much of the tradition is open to a non-dualistic interpretation, treating the other of wilderness not as the mere absence of the human but as the presence of something else. The extensive concern with natural history in all the major figures of the wilderness tradition strongly supports this non-dualistic interpretation of wilderness as presence. And if wilderness is not simply the absence of human touch, then valuing and preserving it need not lead to misanthropy. People visiting but not remaining is not the essence of wilderness but a practical strategy for protecting what is essential to wilderness: the living, active presence of nonhuman nature, whether it be grizzly bears or giant trees.

Other responses have come from the new conservationists, a diverse alliance of wilderness activists and conservation biologists, who pushed for a much more aggressive preservation strategy in the 1990s and 2000s. The Wildlands Project, for example, proposed a map of wilderness areas, buffer zones and wildlife corridors that puts 50% of the contiguous US into some form of protected status. James Turner suggests that this more aggressive strategy precipitated the great new wilderness debate (2012). But the new conservationists, such as Reed Noss and Dave Foreman, are clear that their sense of wilderness is largely about securing the wildlife habitat necessary to mitigate the extinction crisis (Foreman 1995, 1998 and Noss 1991). Rather than looking for lands supposedly never touched by people, they seek to restore much land that is presently heavily trammeled and dominated by the works of man. And rather than seeing nature as static, their pursuit of bigger and bigger wilderness areas is driven by an increased understanding of landscape dynamics and of the population sizes needed for evolution to occur.

The legacy of wilderness in American thought and policy is complex, with some parts that have many opponents (for example, the erasure of indigenous cultures and histories) and some that have very wide appeal (for example, the national parks). The writings of Thoreau, Muir and Leopold have enriched and enchanted the lives of many Americans. The National Wilderness Preservation System has been remarkably successful at preserving large roadless areas, and many conservation biologists see an extension of this strategy as the best hope for protecting biodiversity. Others have found the cultural baggage of wilderness too great, and would prefer to take other strategies, hoping to better integrate the human economy with natural systems. Clearly wilderness preservation cannot solve all environmental problems, such as environmental injustice or climate change, but it may help with many problems, including those.

6. References and Further Reading

Abbey, Edward. Desert Solitaire: A Season in the Wilderness. (New York: McGraw Hill, 1968).
- An influential articulation of a wilderness philosophy, this book was written after the Wilderness Act but early in the process of review and designation. It is deeply imbued with an appreciation of the desert southwest.
Bartram, William. Travels and Other Writings. Thomas P. Slaughter, ed. (New York: Library of America, 1996). Bartram’s Travels was first published in 1791.
- His major literary work, representing natural history in a romantic mode and a literary genre of significant importance for the growing wilderness appreciation.
Bugbee, Henry. The Inward Morning: A Philosophical Exploration in Journal Form (Athens, Ga: University of Georgia Press, 1999). First published in 1958.
- A remarkable and beautiful use of wilderness for understanding reality and our place in it. Deep Thoreauvian reflections in dialogue with mid-20^th century philosophy.
Callicott, J. Baird. “The Conceptual Foundations of the Land Ethic.” Companion to A Sand County Almanac: Interpretive and Critical Essays. J. Baird Callicott, ed. (Madison: University of Wisconsin Press, 1987): 186-217.
Callicott, J. Baird. “The Wilderness Idea Revisited: The Sustainable Development Alternative” The Environmental Professional 13 (1991): 235-47. Reprinted in The Great New Wilderness Debate.
Callicott, J. Baird and Michael Nelson, eds. The Great New Wilderness Debate (Athens, GA: University of Georgia Press, 1998).
- A comprehensive collection of contemporary wilderness criticism, including a selection of important works from across the history of the wilderness tradition. It also includes several significant original pieces.
Callicott, J. Baird and Michael Nelson, eds. The Wilderness Debate Rages On: Continuing the Great New Wilderness Debate (Athens, GA: University of Georgia Press, 2008).
- A second large collection, this volume includes a lot of the critical scholarship on wilderness published since the first collection. It also covers some gaps in the previous volume, including important works by early 20^th century ecologists and more discussion of race and class.
Chipeniuk, Raymond. “The Old and Middle English Origins of ‘Wilderness.’” Environments 21 (1991): 22-28.
Coates, Peter. Nature: Western Attitudes since Ancient Times (Berkeley: University of California Press, 1998).
- This book is especially helpful on Roman and Medieval times, often skipped over in other treatments, and it balances the history of ideas with the history of the environment, considering ancient impacts in some depth.
Cole, David N. and Laurie Yung, eds. Beyond Naturalness: Rethinking Park and Wilderness Stewardship in an Era of Rapid Change. 2^nd ed. (Washington, D.C.: Island Press, 2010).
- Diverse approaches to interpreting naturalness and wildness are considered in light of the practical management of protected areas and the challenges currently facing such management, including climate change and invasive species.
Cronon, William, ed. Uncommon Ground: Rethinking the Human Place in Nature. (New York: W. W. Norton & Company, 1995).
- This anthology is largely critical of the idea of wilderness and includes Cronon’s much discussed piece, “The Trouble with Wilderness, or, Getting Back to the Wrong Nature.” It includes several other worthwhile chapters as well, particularly Anne Spirn’s chapter on the legacy of Frederick Law Olmsted.
Emerson, Ralph Waldo. Nature (Boston: James Munroe & Company, 1836).
- Emerson’s classic is widely available in print and on the internet, including a scanned image of the 1836 original.
Friskics, Scott. “The Twofold Myth of Pristine Wilderness: Misreading the Wilderness Act in Terms of Purity” Environmental Ethics 30 (2008): 381-99.
Foreman, Dave. “Wilderness Areas for Real.” The Great New Wilderness Debate. J. Baird Callicott and Michael Nelson, eds. (Athens, GA: University of Georgia Press, 1998): 395-407.
Foreman, Dave. “Wilderness: From Scenery to Nature” Wild Earth 5(4) (Winter 1995/96): 9-16. Reprinted in The Great New Wilderness Debate.
Guha, Ramachandra. “Radical American Environmentalism and Wilderness Preservation: A Third World Critique.” Environmental Ethics 11 (1989): 71-83. Reprinted in The Great New Wilderness Debate.
Harding, Walter. The Days of Henry Thoreau: A Biography. 2^nd ed. (Mineola, NY: Dover Publications, 2011).
- First published by Knopf in 1965, this biography has seen many printings. See also Richardson, 1988.
Hargrove, Eugene C. Foundations of Environmental Ethics (Denton: Environmental Ethics Books, 1996).
- First published in 1989, this work is valuable for its discussion of the history of property rights and their tension with preservation. It also defends the viability of aesthetic arguments for preservation and their connection to wildlife conservation.
Harvey, Mark. Wilderness Forever: Howard Zahniser and the Path to the Wilderness Act (Seattle: University of Washington Press, 2005).
- Zahniser was the primary author of the Wilderness Act and a driving force behind its eventual passage.
Leopold, Aldo. A Sand County Almanac and Sketches Here and There. Special Commemorative Edition (Oxford: Oxford University Press, 1987). First published in 1949.
- Aldo Leopold’s most influential work, accepted for publication just before his death. The last section of the book, called the “Upshot,” contains the most direct discussion of wilderness and the land ethic.
Leopold, Aldo. The River of the Mother of God and Other Essays. Susan L. Flader and J. Baird Callicott, eds. (Madison: University of Wisconsin Press, 1991).
- Many of Leopold’s other works, arranged chronologically, enabling the reader to see the development of his thought over time.
Lewis, Michael. American Wilderness: A New History (Oxford: Oxford University Press, 2007).
- An anthology covering diverse aspects of the history of wilderness and preservation in America, updating and complementing Nash’s work in several ways. For instance, it includes a chapter chronicling the extensive role of women and women’s clubs in the early preservation movement.
Lowenthal, David. George Perkins Marsh: Prophet of Conservation (Seattle: University of Washington Press, 2000).
- A scholarly biography situating Marsh’s life and work in relation to the early conservation movement.
Marsh, George Perkins. Man and Nature; or, Physical Geography as Modified by Human Action (New York: Charles Scribner, 1864).
- Immensely influential on the beginnings of the conservation movement, this work by Marsh first clearly established that human labor in nature is often more destructive than helpful. He focuses on the role of forests and deforestation on the condition of waters and soils and on the possibility of people working to heal or restore damaged land.
Meine, Curt D. Aldo Leopold: His Life and Work (Madison: University of Wisconsin Press, 1988).
- This is the foremost biography of Leopold. The 2010 edition has a new preface and a contribution from Wendell Berry.
Muir, John. Our National Parks. (Boston: Houghton, Mifflin & Company, 1901).
Muir, John. Nature Writings. William Cronon, ed. (New York: Library of America, 1997).
- Most of Muir’s writings were published first as magazine articles, and later collected into books. This collection contains many of the most influential pieces.
Nash, Roderick Frazier. Wilderness and the American Mind. 5^th ed. (New Haven: Yale, 2014).
- First published in 1967, this work was path-breaking scholarship and has had enduring popularity with wilderness enthusiasts and activists. Several chapters have been added in subsequent editions, and the 5^th edition includes a foreword by Char Miller.
Nash, Roderick Frazier. “‘Wild-dēor-ness,’ The Place of Wild Beasts.” Wilderness: the Edge of Knowledge. Maxine E. McCloskey, ed. (San Francisco: Sierra Club, 1970): 34-37.
Norton, Bryan G. “The Constancy of Leopold’s Land Ethic.” Conservation Biology 2(1) (1988): 93-102.
Noss, Reed. “Wilderness Recovery: Thinking Big in Restoration Ecology.” The Environmental Professional 13 (1991): 225-34. Reprinted in The Great New Wilderness Debate.
Oelschlaeger, Max. The Idea of Wilderness (New Haven: Yale, 1991).
- Extensive treatment of the major figures of the wilderness tradition. Includes a notable chapter on the poets Robinson Jeffers and Gary Snyder.
Plumwood, Val. “Wilderness Skepticism and Wilderness Dualism.” The Great New Wilderness Debate. J. Baird Callicott and Michael Nelson, eds. (Athens, GA: University of Georgia Press, 1998): 652-690.
Richardson, Robert. Henry Thoreau: A Life of the Mind (Oakland: University of California Press, 1988).
- This biography focuses on the intellectual development of Thoreau, with critical discussion of his written work.
Sachs, Aaron. The Humboldt Current: Nineteenth-Century Exploration and the Roots of American Environmentalism (New York: Viking, 2006).
- Sachs provides an in depth discussion of the influence of romantic natural history, especially in the person of Alexander von Humboldt, on American culture and attitudes toward nature.
Smallwood, William Martin. Natural History and the American Mind (New York: AMS Press, 1967).
- Chronicles the development of natural history and its cultural importance in the American colonies and the young republic.
Spence, Mark David. Dispossessing the Wilderness: Indian Removal and the Making of the National Parks (Oxford: Oxford University Press, 1999).
Sutter, Paul. Driven Wild: How the Fight Against Automobiles Launched the Modern Wilderness Movement (Seattle: University of Washington Press, 2002).
Thoreau, Henry David. The Journal of Henry D. Thoreau. 14 volumes. B. Torrey and F. Allen, eds. (New York: Dover, 1962). Originally published in 1906.
Thoreau, Henry David. Walden: A Fully Annotated Edition. Jeffrey S. Cramer, ed. (New Haven: Yale University Press, 2004).
Thoreau, Henry David. Essays: A Fully Annotated Edition. Jeffrey S. Cramer, ed. (New Haven: Yale University Press, 2013).
- This volume contains “Walking” and his most important wilderness travel and natural history writings.
Turner, Frederick Jackson. The Frontier in American History (New York: Henry Holt & Company, 1921).
- Turner’s “frontier thesis” was originally given as an address in 1893, just after the census declared the end of the frontier. The idea gave fervor to the growing frontier nostalgia, and its accuracy as history has been long debated.
Turner, Jack. The Abstract Wild. (Tucson: University of Arizona Press, 1996).
- A manifesto and sustained argument against, among other things, the sufficiency of managed parks for the preservation of wildness.
Turner, James Morton. “From Woodcraft to ‘Leave No Trace’: Wilderness, Consumerism, and Environmentalism in Twentieth-Century America” Environmental History 7(3) (2002): 462-84. Reprinted in The Wilderness Debate Rages On.
Turner, James Morton. The Promise of Wilderness: American Environmental Politics since 1964 (Seattle: University of Washington Press, 2012).
- This work picks up the history where Nash’s book left off, successfully putting to rest any notion that public lands preservation has been less important to environmentalism since the 60s. This is the best source on the way different agencies and organizations have interpreted wilderness in applying the legal designation.
White, Lynn, Jr. “The Historical Roots of Our Ecological Crisis.” Science 155 (1967): 1203-07.
Woods, Mark. “Federal Wilderness Preservation in the United States: The Preservation of Wilderness?” The Great New Wilderness Debate. J. Baird Callicott and Michael Nelson, eds. (Athens, GA: University of Georgia Press, 1998): 131-153.
Worster, Donald. A Passion for Nature: The Life of John Muir (Oxford: Oxford University Press, 2008).
- An extensive biography of Muir by one of the foremost environmental historians.
Worster, Donald. Nature’s Economy: A History of Ecological Ideas. 2^nd ed. (Cambridge: Cambridge University Press, 1994).
- This is an important treatment of the romantic natural history tradition and its legacy in general, and of Thoreau in particular.

Author Information

David Henderson
Email: dghenderson@wcu.edu
Western Carolina University
U. S. A.

Ethics and Contrastivism

A contrastive theory of some concept holds that the concept in question only applies or fails to apply relative to a set of alternatives. Contrastivism has been applied to a wide range of philosophically important topics, including several topics in ethics. Contrastivism about reasons, for example, holds that whether some consideration is a reason for some action depends on what we are comparing that action to. The fact that your guests are vegetarian is a reason to make vegetable lasagna rather than make roast duck, but not a reason to make vegetable lasagna rather than make mushroom risotto. Contrastivism about obligation holds that what agents are obligated to do can likewise vary with the alternatives. So, for example, you may be obligated to take the book back to the library rather than leave it on your shelf, but not obligated to take the book back to the library rather than send it to the library with a friend. The article begins by clarifying what contrastivism is more generally, in order to see what motivates philosophers to accept contrastivism about some topic. Along the way, challenges and choice points facing the contrastivist will be highlighted. Attention is then given to exploring arguments for, and applications of, contrastivism to topics in ethics, including obligations, reasons, and freedom and responsibility.

Contrastivism in General
Contrastivism in Ethics
General Challenges
1. Setting the Contrast Class
2. Cross-Context Inferences
Conclusion
References and Further Reading

1. Contrastivism in General

In this section we will briefly introduce the broad range of topics that have received a contrastive treatment in areas outside of ethics, and see what kinds of arguments contrastivists about some concept deploy. This will give us a broad outline of contrastivism as a general kind of view in philosophy.

a. Contrastivism in Different Domains

i. Epistemology

One of the best-known applications of contrastivism relates to knowledge. There are also contrastive theories of justification and of belief, but I will focus here on knowledge. According to the traditional, non-contrastive conception of knowledge, it is a two-place relation holding between a subject and a proposition: Ksp—s knows that p. Contrastivism, on the other hand, holds that knowledge is a three-place relation holding between a subject, a proposition, and a contrast.

There are differences in conceptions of the contrast. Some contrastivists treat the contrast as a single proposition, q, incompatible with p, yielding Kspq—s knows that p rather than q. Others treat the contrast as a set, Q, of mutually exclusive propositions that includes p, yielding KspQ—s knows that p out of Q, where Q may be {p, q, r, s}. This difference is non-essential, at least for most purposes, since we can translate from Kspq to KspQ by letting Q = {p, q}, and we can translate from KspQ to Kspq, where Q = {p, r, s, t}, by letting q = r ∨ s ∨ t. Many examples used in arguments for contrastivism involve the phrase “rather than”, which generally contrasts two propositions (“s knows that p rather than q”). So for these examples, the single proposition conception of the contrast is more natural. Nevertheless, we will adopt the set of alternatives conception. As we will see in the section Contrastivism and Questions, this conception more directly represents the important contrastivist idea that contrastivity can be thought of as question-relativity.
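
The translation between the two conceptions of the contrast can be sketched in a few lines of Python. This is only an illustration, and it rests on an assumption not made in the text: propositions are modeled as sets of possible worlds, so that disjunction becomes set union. The function names and the toy worlds are hypothetical.

```python
# Assumption: a proposition is modeled as a frozenset of possible worlds.


def to_set_contrast(p, q):
    """Translate 'p rather than q' into 'p out of Q', with Q = {p, q}."""
    return frozenset({p, q})


def to_single_contrast(p, Q):
    """Translate 'p out of Q' into 'p rather than q', where q is the
    disjunction (here: union) of the members of Q other than p."""
    others = [alt for alt in Q if alt != p]
    return frozenset().union(*others)


# Toy worlds, labeled w1..w4 purely for illustration.
p = frozenset({"w1"})
r = frozenset({"w2"})
s = frozenset({"w3"})
t = frozenset({"w4"})

Q = frozenset({p, r, s, t})
q = to_single_contrast(p, Q)       # the disjunction of r, s and t
Q_pair = to_set_contrast(p, q)     # the two-membered set {p, q}
```

On this modeling, the single contrast recovered from Q = {p, r, s, t} is the proposition true in exactly the worlds where r, s or t is true, which is what the translation in the text requires.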

Contrastivism about knowledge has its roots in the relevant alternatives contextualist theory of knowledge, developed in, for example, Dretske (1970) and Lewis (1996). According to this theory, whether a knowledge ascription, “s knows that p”, is true in a context depends on which alternatives to p are relevant in that context, and whether s can rule them out. As the context varies, the relevant alternatives may vary, and so whether a knowledge ascription is true can also vary. Relevant alternatives theorists have worked to spell out what makes an alternative relevant in a context, but have not yet produced a very satisfying picture. Contrastivists claim to do better: the relevant alternatives are provided by a question under discussion, which we have independent reason to accept in our theory of communication. For example, linguists (for example, Roberts, 201)) have argued that positing such a question under discussion helps explain various linguistic phenomena.

Contrastivists about knowledge claim several advantages over non-contrastive conceptions. The first kind of argument for contrastivism is linguistic: the theory can make better sense of a range of knowledge ascriptions, including explicitly contrastive ascriptions (“Ann knows that it’s a zebra rather than an ostrich”), ascriptions involving intonational stress (“Ann knows that the zebra is in the pen”), and ascriptions with a wh-complement (“Ann knows where the zebra pen is”). All of these ascriptions are plausibly treated as making reference to a question under discussion, or set of alternatives.

A second kind of argument appeals to theoretical advantages of contrastivism. For example, contrastivism promises to provide a solution to puzzles that have haunted epistemology, like the closure paradox. Moore knows that he has hands, and knows that if he has hands, then he is not a brain in a vat. But Moore does not know that he is not a brain in a vat. How can this be? Well, Moore knows that he has hands rather than flippers, but he does not know that he has hands rather than that he is a brain in a vat. So according to the contrastivist, this seemingly intractable paradox actually relies on a fallacious equivocation: we cannot assume that because Moore knows that he has hands rather than flippers that he therefore knows that he has hands rather than that he’s a brain in a vat. One way to read the closure paradox is as a puzzle about knowledge ascriptions: why do we ascribe Moore knowledge that he has hands but not knowledge that he is not a brain in a vat? But there is also a nonlinguistic side to the puzzle: Moore’s knowledge that he has hands seems incompatible with his ignorance about whether he’s a brain in a vat, given a very plausible closure principle. This does not have anything directly to do with knowledge ascriptions (though obviously intuitions must be drawn out by presenting knowledge ascriptions). It rather points out something troubling about the concept of knowledge: either it does not apply where we think it does, or it does not obey the kind of logic we think it does. The contrastivist solution is to say that knowledge is a contrastive concept, so that the puzzling question is simply ill-conceived. Moore’s knowledge that he has hands is in fact not incompatible with his ignorance about whether he’s a brain in a vat. 
I call this a theoretical argument for contrastivism, rather than a linguistic one, because it involves showing how contrastivism can resolve paradoxes involving the concept of knowledge, not merely deliver attractive interpretations about a range of knowledge ascriptions.

There are other theoretical arguments for contrastivism about knowledge. First, the theory allows us to track inquiry (See Schaffer, 2005a). Inquiry involves answering questions and ruling out alternatives, and the contrast argument place lets us keep track of the question we are answering, and the alternatives we have ruled out. A further theoretical motivation for contrastivism about knowledge comes from the idea that the most important theoretical and practical function of knowledge is to identify good sources of information (see especially Craig, 1990; Schaffer, 2005a). The contrastivist can add to this claim the observation that when we are looking for good sources of information, we have a particular question in mind (though it may be a quite general question). A good informant for one question (for example, why is it raining rather than snowing?) may not be a good informant for a different question (for example, why is it raining rather than not precipitating at all?). So a contrastive concept of knowledge would best explain its primary function.

These arguments, like other theoretical arguments (for example, Morton, 2012) aim to show that contrastivism lets us best make sense of the theoretical, as well as practical, role of knowledge. The specifics of how these arguments go are less important for our purposes here; the important point is that there are two broad classes of arguments for contrastivism about some concept: (i) linguistic arguments and (ii) theoretical arguments. This pattern carries over to different domains, including ethics. The line between the two kinds of arguments will not be sharp. This is due in part to the fact, noted above, that often theoretical puzzles about some concept have to be drawn out by appealing to ascriptions of that concept. Though many of the clearest motivations for contrastivism do involve ascriptions of the target concept, it is nevertheless important to keep in mind that contrastivism is more than simply a linguistic thesis and has more than simply linguistic advantages.

A special case of contrastivism about knowledge—one that is especially relevant for this article—is Sinnott-Armstrong’s (2006) contrastive account of moral knowledge. Sinnott-Armstrong applies contrastivist ideas developed in his own earlier work and by contrastivists like Schaffer to moral epistemology. An interesting twist is that Sinnott-Armstrong uses contrastivism as a route to a kind of moral skepticism—the view that we do not have moral knowledge. Here is the basic idea: though many explicitly contrastive knowledge ascriptions, like “I know that it is morally wrong to terminate the pregnancy using non-sterilized equipment rather than to terminate the pregnancy using sterilized equipment”, may well be true, we should suspend judgment about the truth of non-contrastive ascriptions like “I know that it is morally wrong to terminate the pregnancy”. All knowledge ascriptions require some set of alternatives before they can be evaluated for truth. If one is not provided explicitly, Sinnott-Armstrong argues, we should understand the ascriptions as “I know that p out of the relevant contrast class”. And this is where the skeptical turn appears: Sinnott-Armstrong argues that we should be relevance skeptics—we should suspend judgment about what the relevant contrast class is. Hence, we cannot evaluate the truth of the unrelativized knowledge claims. This is not quite the dogmatic skeptical claim that we lack moral knowledge. Instead, this is a Pyrrhonian skeptical thesis: we should suspend judgment about the truth of unrelativized attributions of moral knowledge (and of knowledge more generally). Nevertheless, it is notable that other contrastivists appeal to contrastivism to resolve skeptical paradoxes, while Sinnott-Armstrong uses contrastivism in an argument for a kind of skepticism.

ii. Philosophy of Science

Contrastive theses have also been offered in the philosophy of science. Traditional theories of explanation hold that the explanatory relation holds between two relata: pEq—p explains q. Contrastive theories of explanation hold that we need at least one, and possibly two, more argument places for contrasts. We may have pQEq—p out of Q (or “rather than any other member of Q”) explains q; pEqQ—p explains q out of Q; or pQ1EqQ2—p out of Q1 explains q out of Q2. Once again, there are both linguistic arguments and theoretical arguments for these contrastivist theories. For example, “The warm temperature explains why it is raining rather than snowing” may be true, while “The warm temperature explains why it is raining rather than not precipitating” may be false. (For more on contrastivism about explanation, see van Fraassen, 1980; Lipton, 1990 and Hitchcock, 1996.)

Relatedly, philosophers have offered contrastive theories of causation. Instead of holding that the causal relation is two place, eCf—e causes f—contrastivists hold that we need at least one, and possibly two, more argument places. Either eQ1Cf, eCfQ2, or eQ1CfQ2. Contrastivism purports to solve several puzzles facing traditional non-contrastive theories of causation, including causation by absences and the puzzle of saying what the cause of some event is. (See, for example, Schaffer, 2005b, 2012; and Hitchcock, 1996a, 1996b.)

Finally, philosophers have also offered contrastive theories of confirmation. According to this view, whether some evidence confirms a hypothesis depends on what we are comparing that hypothesis to. For example, the wet sidewalk confirms the hypothesis that it rained rather than that it was sunny all day, but does not confirm the hypothesis that it rained rather than that someone washed her bike on the sidewalk a few minutes ago. (See Chandler, 2007, 2013 and Fitelson, 2012 for discussion.)
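
One way to make contrastive confirmation concrete is a likelihood comparison: evidence favors one hypothesis rather than another when the evidence is more probable given the first than given the second. The sketch below illustrates that one formalization only; it is not necessarily the account defended in the works cited above, and the probabilities are made-up numbers chosen to mirror the wet-sidewalk example.

```python
def favors(likelihoods, h1, h2):
    """Return True if the evidence favors h1 rather than h2, that is,
    if P(E | h1) > P(E | h2). `likelihoods` maps each hypothesis to P(E | h)."""
    return likelihoods[h1] > likelihoods[h2]


# Hypothetical values of P(the sidewalk is wet | hypothesis).
p_wet_given = {
    "it rained": 0.9,
    "it was sunny all day": 0.05,
    "someone washed a bike on the sidewalk": 0.9,
}

rain_vs_sun = favors(p_wet_given, "it rained", "it was sunny all day")
rain_vs_bike = favors(
    p_wet_given, "it rained", "someone washed a bike on the sidewalk"
)
```

With these assumed numbers, the wet sidewalk favors rain over a sunny day but does not favor rain over the bike-washing hypothesis, since both predict a wet sidewalk equally well.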

b. Contrastivism and Questions

Contrastivists often claim that their theories are ones according to which the target concept is question-relative: relative to one question, the concept holds, while relative to another, it does not. For example, Schaffer (2005a, 2007a) argues that to know that p is to know that p as the answer to the contextually relevant question. So relative to a question like, “Is the bird a canary or a raven?”, you know that it is a canary—you know the answer to this question. But relative to the question, “Is the bird a canary or a goldfinch?”, you do not know that it is a canary—you do not know the answer to this second question.

Question-relativity is a natural idea for contrastivists. Questions—thought of as the informational contents of interrogative sentences, analogously to thinking of propositions as the informational contents of declarative sentences—are standardly treated as partitions over (some part of) logical space. These partitions divide logical space into cells, so that the possibilities are grouped in mutually exclusive classes. These partitions can also be thought of, then, as sets of mutually exclusive alternatives—each alternative in the set corresponds to one cell in the partition. Thus, relativizing a concept to questions simply amounts to relativizing it to sets of alternatives, which is exactly what the contrastivist wants to do. Different questions give us different partitions, and so correspond to different sets of alternatives.

To see this approach in action, return to the epistemological example. The question expressed by “Is the bird a canary or a raven?” is represented by the set of alternatives, {the bird is a canary, the bird is a raven}. Recall that this is a representation of a partition of (part of) logical space into two cells, one containing possibilities in which the bird is a canary and the other containing possibilities in which the bird is a raven. Similarly, the question expressed by “Is the bird a canary or a goldfinch?” is represented by the set of alternatives, {the bird is a canary, the bird is a goldfinch}. If we relativize knowledge to questions, then, we can explain why “You know the bird is a canary” is true when the relevant question is the first, but false when the relevant question is the second. For now, we will assume that in a given context, there is a relevant question which supplies the set of alternatives. In the section “Setting the Contrast Class” we will consider some problems for this assumption.
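
The question-relative picture can be illustrated with a toy model. The sketch below borrows the "ruling out" idea from the relevant alternatives tradition mentioned earlier: a subject counts as knowing p relative to a question just in case p is one of the alternatives and the subject's evidence rules out every other alternative. The `knows` function and the set of ruled-out alternatives are assumptions made for illustration, not an official contrastivist analysis.

```python
def knows(p, question, ruled_out):
    """Toy test for 's knows that p out of Q'.

    question  -- the set of alternatives Q supplied by the question
    ruled_out -- alternatives the subject's evidence eliminates (assumed)
    """
    if p not in question:
        return False
    return all(alt in ruled_out for alt in question if alt != p)


CANARY = "the bird is a canary"
RAVEN = "the bird is a raven"
GOLDFINCH = "the bird is a goldfinch"

# Assumption: you can tell a small yellow bird from a raven,
# but you cannot tell a canary from a goldfinch.
ruled_out = {RAVEN}

q1 = {CANARY, RAVEN}       # "Is the bird a canary or a raven?"
q2 = {CANARY, GOLDFINCH}   # "Is the bird a canary or a goldfinch?"

knows_relative_to_q1 = knows(CANARY, q1, ruled_out)
knows_relative_to_q2 = knows(CANARY, q2, ruled_out)
```

The same subject with the same evidence comes out as knowing that the bird is a canary relative to the first question and as not knowing it relative to the second, which is the pattern the contrastivist wants to capture.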

More directly relevant for ethics, contrastivists about normative concepts like “ought” and reasons have developed theories according to which these concepts are relativized to deliberative questions, or questions of what to do. In a given deliberative context—the kinds of context in which we ordinarily appeal to concepts like “ought” and reasons—there is some particular deliberative question we are trying to answer, since answering a deliberative question is just deciding what to do. This question supplies the set of alternatives relative to which claims about what we ought to do or have reason to do are interpreted.

c. Non-Exhaustivity and Resolution-Sensitivity

Thinking of a contrastive theory of some concept in terms of question-relativity helps bring out two important features of contrastivism. Both of these features are exploited by contrastivists.

First, questions may partition only part of, or some subspace of, logical space. Some possibilities may just not be relevant, for one reason or another, or may be ruled out by the presuppositions of the question. For example, if I ask which beer you want to try, possibilities in which you do not want to try any of the beers are plausibly not included. You can of course say that you do not want to try any beers, but this seems more like rejecting the question (admittedly in a conversationally cooperative way) than answering it, since answering a question requires selecting one of the alternatives, or one cell of the partition. The relevance of this point for contrastivism is that the set of alternatives to which a concept is relativized may be non-exhaustive of logical space. This is clearest in the case of explicitly contrastive “rather than” ascriptions, like “You know that the bird is a canary rather than a raven”. Here, the contrastivist about knowledge will say that this sentence means that you know that the bird is a canary relative to the set {the bird is a canary, the bird is a raven}. Clearly there are many other possibilities: the bird could be a goldfinch, a crow, a robot made to look like a canary, or you could be dreaming. Relative to sets that include some of these other alternatives, you may not know that the bird is a canary. But since, on this view, knowledge claims are relativized to non-exhaustive sets of alternatives, it may still be true that you know that it is a canary relative to {the bird is a canary, the bird is a raven}.

Second, the possibilities that are partitioned can be grouped together in more or less fine-grained ways. Some distinctions between possibilities may be respected by the partition while others are smudged over. Compare the following two sets: {it’s a bird, it’s not a bird}, {it’s a canary, it’s a goldfinch, it’s a crow, it’s some other kind of bird, it’s a robot, it’s a hallucination, it’s some other kind of non-bird}. The second set makes distinctions between possibilities that are ignored in the first set. These sets differ in what Yalcin (2011) and Cariani (2013) call resolution: sets which make more fine-grained distinctions partition (parts of) logical space at a higher resolution. To say that some concept is resolution-sensitive, at least here, is to say that it is relativized to sets that may vary in resolution. Relative to a set at one resolution, the concept may hold of something, while relative to a set at a different resolution—either higher or lower—it may not.
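
The difference in resolution between the two sets can be sketched in Python. This is a hedged illustration: treating "higher resolution" as partition refinement (every cell of the finer partition lies inside a cell of the coarser one) is an assumption made here, and the names coarse, fine, is_partition and refines are hypothetical.

```python
# Possibilities are labelled by strings; a partition is a set of cells (frozensets).
coarse = {
    frozenset({"canary", "goldfinch", "crow", "other bird"}),  # it's a bird
    frozenset({"robot", "hallucination", "other non-bird"}),   # it's not a bird
}
# The finer partition gives each possibility its own cell.
fine = {frozenset({p}) for cell in coarse for p in cell}


def is_partition(cells):
    """Check that the cells are non-empty and pairwise disjoint (mutually exclusive)."""
    seen = set()
    for cell in cells:
        if not cell or seen & cell:
            return False
        seen |= cell
    return True


def refines(finer, coarser):
    """True if every cell of `finer` lies inside some cell of `coarser`."""
    return all(any(f <= c for c in coarser) for f in finer)


print(is_partition(coarse), is_partition(fine))  # True True
print(refines(fine, coarse))  # True
print(refines(coarse, fine))  # False
```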

2. Contrastivism in Ethics

While applications of contrastivism within epistemology and the philosophy of science are better known, contrastivism has also been applied to a wide range of topics in ethics and normative philosophy more generally. We have already seen that contrastivist ideas have interesting applications in moral epistemology. This section introduces contrastivism about obligation, normative reasons, and freedom and moral responsibility. Having already introduced contrastivism more generally in the previous section, I will focus primarily on describing the specific motivations for the contrastive theories in ethics.

One application of contrastivist ideas in ethics that I will not discuss in detail is due to Driver (2012). Driver suggests a contrastive conception of luck, and makes use of this in her defense of a consequentialist treatment of moral luck. The central contrastivist claim is that no one, or no event, is lucky simpliciter. Rather, something is only lucky or unlucky relative to some contrasts. For example, a patient may be lucky to survive a serious illness rather than die from it, but not lucky to survive the serious illness, rather than not contract the illness in the first place.

a. Contrastivism about Obligation

The oldest application of contrastive ideas in ethics is contrastivism about obligation. Much of the work defending and developing contrastivism about obligation has focused primarily on developing contrastive semantic theories for the terms used to ascribe obligations, especially the deontic modal “ought”. This is not unexpected, since as we saw above, one important style of argument for contrastivism is linguistic in nature; contrastivism about obligation is no different. (Here I will conflate obligation and “ought” in order to stick more closely to the literature, although the concept of obligation is better expressed using stronger deontic modals like “must” and “have to”.)

Contrastivism about obligation holds that what you ought to do can vary with the comparison being made. For example, though you ought to take the book back to the library rather than leave it on the shelf, it is not the case that you ought to take it back to the library rather than send it with me on my trip to the library.

It is important to distinguish the distinctive contrastivist claim from the much more widely accepted claim that what you ought to do depends on the available alternatives. If some option is the best one available, the non-contrastivist will say that it is what you ought to do. If circumstances change so that that option is no longer available, then obviously it is not the case that you ought to do it—it is not even an option. So what you ought to do has changed with the alternatives. But importantly, it has changed with the available alternatives. There is nothing surprising about this claim, and it is not the distinctive contrastivist claim. The distinctive contrastivist claim is that even holding the available alternatives fixed, what you ought to do can vary with the particular comparison. That is, claims about what you ought to do are only true or false relative to some particular set of alternatives, which may not include all of the available alternatives.

This puts us in a position to see one argument for contrastivism about obligation. Suppose that all of the following methods of getting to work are available: driving your SUV, taking the bus, riding your bike. The relevant factors here are environmental friendliness and getting some exercise. So riding your bike is best and driving your SUV is worst. The non-contrastivist will of course say that, in this case, you ought to ride your bike. And this is very plausible. But the following claim is also very plausible:

(1) You ought to take the bus rather than drive your SUV.

But since taking the bus is not the best available alternative—riding your bike is also an available alternative—it is hard to see how the non-contrastivist can explain the truth of (1). The contrastivist, on the other hand, has an easy time explaining this. Out of the set of alternatives {take the bus, drive your SUV}, taking the bus is the best. And what you ought to do out of a set of alternatives is the best alternative in that set. So even if there are better available alternatives, we can still make true “ought” claims about suboptimal alternatives, as long as they are the best in the relevant set of alternatives; using a “rather than” claim as in (1) is one way of making a set the relevant one.
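
The contrastivist explanation of (1) can be sketched in Python. This is a minimal illustration, assuming that what you ought to do out of a set is simply its highest-scoring member; the name ought and the numeric scores are hypothetical, and only the order of the scores matters.

```python
# Higher score = better, given the two factors in the example
# (environmental friendliness and exercise). The numbers are hypothetical.
score = {"ride your bike": 2, "take the bus": 1, "drive your SUV": 0}


def ought(alternatives):
    """Return the best alternative in the given contrast set."""
    return max(alternatives, key=score.__getitem__)


# Relative to all available alternatives, you ought to ride your bike.
print(ought({"ride your bike", "take the bus", "drive your SUV"}))  # ride your bike
# Relative to the set made relevant by "rather than" in (1), you ought to take the bus.
print(ought({"take the bus", "drive your SUV"}))  # take the bus
```

The available alternatives are the same in both calls; only the set to which the claim is relativized differs.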

The non-contrastivist can of course try to reinterpret claims like (1) so that they do not require relativizing “ought” to sets of alternatives. For example, we may read (1) as saying something like, “If you are going to either take the bus or drive your SUV, you ought to take the bus”. One problem for this reply, as emphasized in an epistemic context by Schaffer (2008), is that this requires reading “rather than” as contributing some kind of conditional. But this is not a plausible general theory about the contribution of “rather than” clauses. It is much more linguistically plausible to treat “rather than” as making explicit the comparison being made, as the contrastivist does.

An even more important source of motivation for contrastivism about obligation comes from the puzzles of deontic logic, the logic of obligation. Many of these puzzles have the following form: acceptable “ought” claims lead, via plausible inference rules, to unacceptable “ought” claims. Here is just one example, called Ross’s Paradox, since it is originally due to Alf Ross (1941). Suppose you promise your friend that you will mail a letter for her. Then (2) is true:

(2) You ought to mail the letter.

One inference rule that is validated by the standard semantics for “ought”, and by standard deontic logic, is the following:

Inheritance: If doing A entails doing B, then if you ought to do A, you ought to do B.

(This rule is usually stated treating “ought” as a propositional operator, read as “it ought to be that p”, instead of as (directly) ascribing an obligation, as in “you ought to A”. The difference goes beyond the scope of this article.) Besides being validated by orthodox treatments of “ought”, this inference rule has a lot of initial plausibility. One way to see this plausibility is to think of the special case in which doing B is a necessary means to doing A, and in that sense doing A entails doing B. If the only way to do something you ought to do requires doing B, then very plausibly, you thereby ought to do B. But Inheritance leads to unacceptable results. Note that mailing the letter entails either mailing it or burning it, just because A entails (A or B), for any B. So from the acceptable “ought” claim (2), via Inheritance, (3) follows:

(3) You ought to either mail the letter or burn it.

While (2) is acceptable, (3) is not. It ascribes an obligation to you, mailing the letter or burning it, that you can satisfy by burning the letter. But burning the letter is not a way to do anything you ought to do.

The standard reply to Ross’s Paradox is to accept the consequence, that (3) is true, but explain its apparent unacceptability pragmatically. The basic idea is that (3) is weaker than something else we are in a position to say, namely (2). This is to appeal to Grice’s (1989) maxim of quantity, that we should say the strongest thing we are in a position to say. Saying something weaker, like (3), suggests that we are not in a position to say something stronger, like (2). But in this case, we are in a position to say (2)—in fact, we derived (3) from (2). There have been various challenges to this line of reply; see in particular Cariani (2013).

The contrastivist offers a different solution. The outline of the solution is that the inference from (2) to (3) involves an illicit shift in the set of alternatives to which the “ought” claims are relativized, and hence is equivocal. To see why, remember that the alternatives in a set of alternatives must be mutually exclusive. Then notice that “mail the letter” and “mail the letter or burn it” are not mutually exclusive; so they cannot be members of the same set of alternatives. Thus, (2) and (3) cannot be relativized to the same set of alternatives. In an ordinary context, (2) would be relativized to a set like {mail the letter, leave the letter on the table, give the letter back to your friend, burn the letter}. (3), on the other hand, must be relativized to a set that includes “mail the letter or burn it” as an option, such as {mail the letter or burn it, leave the letter on the table, give the letter back to your friend}. In terms of our distinction between the non-exhaustivity of a set of alternatives and the resolution of a set of alternatives, inferences like the one from (2) to (3) require a shift in resolution: the second set of alternatives lumps together two options, “mail the letter” and “burn the letter”, that are distinct in the first set. Since the contrastivist about obligation holds that obligation claims are sensitive to the resolution of the set of alternatives to which they are relativized, she can hold that the shift in resolution generates a shift in the truth of the obligation claim.

The first thing to see is that we simply cannot infer (3) from (2): to do so would be to equivocate, since the set of alternatives has shifted. It would be like inferring “Chris Paul is tall”, when he’s playing in a professional basketball game, from the truth of “Chris Paul is tall” when he’s at his family reunion (crucial background: Chris Paul is taller than most other members of his family, but shorter than most basketball players). The comparison class has shifted, and “tall” ascriptions are very plausibly relativized to comparison classes—to count as tall, you have to be taller than most members of the relevant comparison class.

The second thing to notice is that, not only can we not infer (3) from (2), we can also say that (3) is actually false. That is because, very plausibly, out of the set {mail the letter or burn it, leave the letter on the table, give the letter back to your friend}, it is not true that you ought to mail the letter or burn it—this is not the best option in the set.
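
Both points can be sketched in Python. The sketch rests on assumptions that go beyond the text and should be read as hypothetical: options are modelled as sets of basic acts, the numeric values are invented, and a disjunctive option is taken to be only as good as its worst way of being carried out. The text itself says only that the disjunctive option is not the best in its set.

```python
from itertools import combinations

# Each option is modelled as the set of basic acts that would count as doing it.
mail = frozenset({"mail"})
burn = frozenset({"burn"})
leave = frozenset({"leave on table"})
give_back = frozenset({"give back"})
mail_or_burn = mail | burn


def mutually_exclusive(options):
    """True if no two options share a basic act."""
    return all(not (a & b) for a, b in combinations(options, 2))


# Hypothetical values. Assumption: a disjunctive option is only as good as
# its worst way of being carried out.
value = {"mail": 3, "give back": 2, "leave on table": 1, "burn": 0}


def ought(options):
    """Return the option whose worst realisation is best."""
    return max(options, key=lambda option: min(value[act] for act in option))


set_for_2 = [mail, leave, give_back, burn]    # the set to which (2) is relativized
set_for_3 = [mail_or_burn, leave, give_back]  # the coarser set required by (3)

print(mutually_exclusive([mail, mail_or_burn]))  # False: they cannot share a set
print(ought(set_for_2) == mail)                  # True: (2) holds
print(ought(set_for_3) == mail_or_burn)          # False: (3) fails
```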

This is the basic outline for one kind of contrastivist solution to puzzles of deontic logic. Cariani (2013) offers an interestingly different kind of contrastivist solution. Cariani takes up the task of blocking problematic inferences, such as the one in Ross’s Paradox, while retaining intuitively acceptable ones that also seem to be supported by rules like Inheritance.

b. Contrastivism and Freedom

Another implementation of contrastivist ideas in ethics is Sinnott-Armstrong’s (2012) contrastive account of freedom and moral responsibility. Central questions in this domain concern whether an agent’s act is free, and hence whether the agent is responsible for the act. Responsibility skeptics argue that since we can always trace the causal history of an act back to causes outside the agent, no one is ever responsible. Their opponents give various responses to this argument, including that freedom and responsibility do not require a lack of causation from outside the agent.

The first application of contrastivism is to what agents are free from. For example, an agent’s act may be free from external physical constraints (for example, chains or a shove) or internal compulsions (for example, addiction), but not free from all preceding causes (for example, the initial conditions of the universe). Such an act would be free rather than the result of a shove or addiction, but not free rather than caused (via a long chain) by the initial conditions of the universe. Adopting this contrastive conception of freedom helps clarify the dispute between responsibility skeptics and their opponents: the debate is over which kind of constraint is the relevant one for attributing responsibility. (Sinnott-Armstrong himself once again denies that there is any one relevant kind of constraint, and so does not take sides in the dispute between responsibility skeptics and their opponents.)

This contrastive picture also explains conflicting intuitions about whether a given act is free. Ordinarily, perhaps, we have in mind constraints like chains or addictions. Most acts in question in debates about freedom and responsibility are free, rather than being constrained by these kinds of things. But what the responsibility skeptic does is draw our attention to another kind of constraint: that of causes outside the agent. Actions are very plausibly not free, rather than being caused at all. If the contrastivist about freedom is right that freedom is a contrastive concept, and that both of these kinds of freedom (freedom from constraints and freedom from preceding causes) are legitimate, then this explains why we may be puzzled by questions about whether a given action is free.

The second application of contrastivism is to what agents are free to do. Sinnott-Armstrong’s illustrative example is of an alcoholic, Al. Suppose Al drinks some whisky at 8pm on Tuesday. We may ask whether this act was free. It seems to depend on the contrasts. Depending on how we specify the details of the case, all of the following may be true:

Al’s drinking whisky rather than wine was free.
Al’s drinking whisky at 8pm rather than at 9pm was free.
Al’s drinking whisky rather than a non-alcoholic drink was not free.
Al’s drinking whisky on Tuesday rather than waiting until Wednesday was not free.

As Sinnott-Armstrong sums up the point: “Addicts never have no control at all in any circumstances…most people are free to choose out of some contrast classes but not out of others.” (Sinnott-Armstrong, 2012:145). So the question of whether Al’s act was free is, for the contrastivist, incomplete. To say whether an action was free, we have to specify what the contrast is—relative to some contrasts, it may be free while relative to others it may not be. The important question then becomes which contrasts are relevant for which purposes. In particular, we can ask which contrasts are relevant for blaming and holding responsible. So contrastivism has helped us isolate the important questions in the debate about moral responsibility.

A related position is contrastivism about legal responsibility. Schaffer (2010) applies his contrastive account of causation (described in the section Philosophy of Science) to the notion of legal causation. If we accept that there is a close connection between the claim that someone caused, in the legally relevant sense, some outcome and the claim that she is legally responsible for that outcome, this contrastive account of causation in the law leads naturally to a contrastive theory of legal responsibility.

c. Contrastivism about Normative Reasons

The last application of contrastivism to ethics is contrastivism about normative reasons. A normative reason for an action is a consideration that counts in favor of performing that action. For example, the fact that you promised to return the book is a reason to return it, and the fact that you are causing me pain is a reason to get off of my foot. Many philosophers think reasons are central to ethics, and to normativity more generally. If that is correct, then contrastivism about normative reasons will likely have widespread implications throughout ethics.

As with most other implementations of contrastivism, contrastivism about reasons can be motivated by linguistic considerations:

(4) The fact that my guest is vegetarian is a reason to make vegetable lasagna rather than roast duck.
(5) The fact that my guest is vegetarian is not a reason to make vegetable lasagna rather than mushroom risotto.

Both of these contrastive claims are true. But now we might want to know, “Is the fact that my guest is vegetarian a reason to make vegetable lasagna or not?”. This is to ask whether this fact is a non-contrastive reason. This question is hard to answer. What this seems to show is that whether this fact is a reason or not depends on the alternatives—that it is a contrastive reason.

There are various ways for the non-contrastivist to respond to this argument. In particular, she may try to provide non-contrastive analyses of these contrastive claims. For example, we may appeal to the fact that reasons have strengths or weights, and hold that some consideration is a reason to do A rather than B when it is a stronger (non-contrastive) reason to do A than it is to do B. In this way, we can explain the truth of claims like (4) and (5) without adopting a contrastive view of reasons.

There are various problems with this kind of strategy. For just one, recall the similar strategy for dealing with contrastive obligation claims discussed in the section “Contrastivism about Obligation”. The idea there was to say that the “rather than” in these claims should be analyzed out as a conditional. The problem was that this is not particularly linguistically plausible, since “rather than” does not ordinarily contribute a conditional. This strategy for dealing with contrastive reason claims faces a similar problem. “Rather than” does not ordinarily mean “stronger than”; instead, “rather than” should be understood as introducing contrasts.

Besides linguistic arguments, the second major kind of argument for contrastivism in some domain is theoretical. Recall that these kinds of arguments are not based primarily on contrastivism’s ability to give attractive interpretations of ascriptions of the target concept (in this case, reasons). Rather, they aim to show that some theoretical role or property of the target concept would be best explained by a contrastive view of the concept. A theoretical argument for contrastivism about reasons is that it best makes sense of the connection between reasons and the promotion of various objectives, like desires or values. A schematic statement of this very common idea is the following:

Promotion: Consideration R is a reason to perform act A if R explains why A-ing would promote objective O.

Again, an objective is some valuable thing to be promoted. Different theories will say different things: desire-based theories think reasons are tied to the promotion of the objects of desires, value-based theories think reasons are tied to the promotion of values like justice or goodness, and so on. No matter which of these theories we accept, we have to say what it takes for some action to promote an objective.

Snedegar (2014b) argues that the best way to do this is to adopt a contrastive picture. Relative to some contrasts an action may promote an objective, while relative to others, it may not. Suppose the relevant objective is contributing to the relief of hunger in the third world. This objective is not promoted by donating to an unreliable charity (they only get the money where it should go 20% of the time) rather than donating to a reliable charity. But it is promoted by donating to an unreliable charity rather than spending the money on an expensive dinner for myself. Hence, this objective gives me a reason to donate to the unreliable charity rather than spend the money on an expensive dinner, but does not give me a reason to donate to the unreliable charity rather than donate to the reliable charity. Non-contrastive views of promotion will deliver the verdict that this objective gives me no reason whatsoever to donate to the unreliable charity. So it is hard for them to explain the fact that it gives me a reason to donate to the unreliable charity rather than spending the money on an expensive dinner.
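
The charity example can be sketched in Python. This is a hedged illustration, not a statement of Snedegar's account: it assumes that an act promotes an objective relative to a contrast when it makes the objective more likely than the contrast does. The figure of 0.2 comes from the example; the other probabilities and the names chance_of_relief and promotes_rather_than are hypothetical.

```python
# Probability that the objective (hunger relief) is achieved by each act.
# 0.2 comes from the example in the text; the other two numbers are hypothetical.
chance_of_relief = {
    "donate to unreliable charity": 0.2,
    "donate to reliable charity": 0.9,
    "buy expensive dinner": 0.0,
}


def promotes_rather_than(act, contrast):
    """Assumed rule: `act` promotes the objective relative to `contrast`
    if it makes the objective more likely than `contrast` does."""
    return chance_of_relief[act] > chance_of_relief[contrast]


print(promotes_rather_than("donate to unreliable charity", "buy expensive dinner"))        # True
print(promotes_rather_than("donate to unreliable charity", "donate to reliable charity"))  # False
```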

We have seen both linguistic and theoretical motivations for contrastivism about reasons. As we saw at the beginning of this section, reasons are often taken to be central to ethics and normativity more generally. So contrastivism about reasons is likely to have many upshots throughout ethics and normative philosophy. One nice thing about this is that it gives us a huge swathe of philosophy against which to test contrastivism about reasons: contrastivism may lead to exciting insights in normative philosophy, or it may lead to unacceptable results. Either way, this seems to be a fruitful area for research.

3. General Challenges

To close, consider some general challenges facing contrastivism of any variety. The specific form of these challenges, and the plausible responses, will likely vary from domain to domain. When it is necessary to apply the challenge to a concrete contrastivist theory, one from ethics will be chosen. As much as possible, however, the article remains at a general level, because it is instructive to think about the general shape of the challenges, as they face the contrastivist qua contrastivist.

a. Setting the Contrast Class

The first few challenges are interrelated, and have to do with setting the relevant contrast class. First, contrastivists face the challenge of saying what set of alternatives a given claim should be relativized to. For explicitly contrastive ascriptions of a concept, for example those using “rather than”, it is straightforward: the “rather than” clause makes the alternatives explicit. But for ascriptions that are not explicitly contrastive, the contrastivist has to provide some way of settling what the relevant set of alternatives is, or else admit that these unrelativized claims are not truth-evaluable, or at least that we should suspend judgment about their truth. To be satisfactory, this should be done in a relatively principled way. Otherwise, the contrastivist may face charges of fixing the contrasts in an ad hoc way to get the results she wants.

We have already seen one popular way to answer this challenge. This is to appeal to a question under discussion in the context. Linguists and philosophers of language have given arguments independent of contrastivism for the inclusion of such a device in our theory of communication. For example, it is useful in interpreting intonational stress (see Rooth, 1992) and in explaining several kinds of pragmatic phenomena (see Roberts, 2012). The contrastivist can exploit this: the question under discussion fixes the set of alternatives relative to which the ascription is interpreted.

But there are other options. Rather than appealing to a question under discussion, the contrastivist may instead appeal to the speaker’s intention, to features of the assessor’s context, or even to features of the subject (of the ascription) or her context. As we have already seen, one prominent contrastivist, Walter Sinnott-Armstrong, argues for a very different solution to the problem of determining the contrast class. Sinnott-Armstrong (2004, 2006) argues that no way of determining relevance is correct, and that we should instead be relevance skeptics. We should simply suspend judgment about the content and truth of non-relativized claims employing a contrastive concept. Sinnott-Armstrong’s arguments are challenging, and if the contrastivist wants to avoid his skepticism, she needs to grapple with them. One way to gain traction here, though this goes beyond the scope of this article, is to seek independent evidence for the existence of a relevant question under discussion in explanations of natural language phenomena. Linguists have developed powerful explanatory theories of various natural language phenomena using questions under discussion. So even if specific proposals about how to determine the relevant contrast class, or question under discussion, face challenges, we at least have some reason to be optimistic that there is such a relevant contrast class or question.

A second and related challenge is that contrastivism delivers apparently objectionable results, as long as the relevant contrast class is set up in the right way. This problem is perhaps sharpest for the contrastivist about obligation. You may be obligated to do all kinds of terrible or crazy things, because the contrast class is crazy. For example, the contrastivist about obligation will say that you are obligated to burn down your neighbor’s house while she is at work—as long as the relevant alternatives are worse than this. So you are obligated to burn down her house while she is at work rather than burn it down with her inside. This is even more objectionable when we remember that these need not be the only options open to you—it may be perfectly possible for you to take her a plate of freshly baked cookies, or to simply stay at home and watch television, instead. Still, the contrastivist will say that you are obligated to burn down her house while she is at work, as long as the relevant alternative is burning it down with her inside.

The contrastivist about obligation is committed to this result, when paired with any plausible theory about what an agent is obligated to do out of a given contrast class. But it is not clear how serious this problem actually is. The explicitly contrastive claim, “You are obligated to burn down her house while she’s at work rather than burn it down when she’s inside” is not obviously false. After all, burning it down while she’s at work is clearly better than burning it down while she’s inside. The bare, non-contrastive claim, “You are obligated to burn down her house while she’s at work” does sound obviously false. But the contrastivist is only committed to the truth of this claim when the only relevant alternatives are things like “burn it down while she’s inside” (or even worse alternatives). In any ordinary context—for example, a context in which you could take her a plate of freshly baked cookies, instead—these will not be the only relevant alternatives. In fact, they are unlikely to be relevant alternatives at all, at least before they are mentioned. In these ordinary contexts, the contrastivist about obligation will not be committed to the truth of the objectionable non-contrastive claim. The details of this solution will depend on what our theory tells us about fixing the relevant set of alternatives, but it should be clear that the contrastivist has options here.

A closely related problem is raised against contrastive theories of moral reasons by Andrew Jordan. Jordan argues that some actions should be, and are, performed in a whole-hearted way—that is, without considering alternatives at all. The virtuous person will simply see that taking her sick pet to the vet is the thing to do and will not consider alternatives, or take into account reasons for alternatives, for example, the potentially high cost. So the reasons favoring the whole-hearted action do not seem to be relativized to any contrast class at all.

This problem only arises if the contrastivist about reasons holds that the contrast class is fixed by the options the subject is considering. But as we have seen, there are many more options for the contrastivist. It is not clear, for example, how this problem could arise on a speaker contextualist theory. So this is not a problem for the contrastivist as such.

Though these last two challenges are not serious problems for contrastivism as such, they are useful in thinking about the first challenge—that of saying what fixes the contrast class for a given claim. The problem of crazy verdicts resulting from crazy contrast classes puts pressure on a very simple version of speaker contextualism, according to which the relevant contrast class is wholly fixed by the speaker’s intentions. As long as the speaker intends a crazy contrast class, the objectionable ascriptions may come out true. This kind of contrastivist would then need to try to explain why this result is not actually objectionable. Jordan’s problem of whole-hearted action puts pressure on a version of contrastivism according to which the relevant contrast class is wholly determined by what the agent is considering—if the virtuous agent is not considering any alternatives, then this version of contrastivism could not supply a contrast class.

Another problem in this vein is harder to articulate in a sharp way. It stems from the idea that there must be an answer to whether the concept really applies, over and above whether it applies relative to any particular set of alternatives. In the case of “ought”, for example, there is a feeling that there must be something that we really ought to do. We can imagine the objector saying, in an exasperated tone, “I know I ought to take the bus rather than drive my SUV. What I want to know is, ought I take the bus?”. Read straightforwardly, this objection is just a rejection of the central thesis of contrastivism. Read in that way, there is not much the contrastivist can say.

There is another, more contrastivist-friendly way to construe this idea. The idea may be that, though there are lots of true claims about when I ought to or have reason to perform some action rather than some other action, in certain kinds of deliberation and theorizing, we are interested in “oughts” and in reasons with some kind of special status. The contrastivist can accommodate this idea by identifying special contrast classes, and claiming that they are relevant in the cases the objector has in mind. Some good candidates include (i) a trivial contrast class, {A, ~A}, (ii) an exhaustive contrast class that includes every possibility open to the agent, (iii) a maximally fine-grained contrast class, and (iv) a contrast class that makes all morally relevant distinctions. These are not mutually exclusive options, of course—for example, all four could be construed as exhaustive sets of alternatives. The contrastivist can hold that some reasons or obligations, for example, moral reasons or obligations, are always relativized to one of these special kinds of contrast class, while other reasons and obligations are not. This is all perfectly consistent with contrastivism, and lets us capture something very close to the idea that there is something we really ought to do or really have reason to do.

b. Cross-Context Inferences

A very different kind of challenge involves cross-context inferences. The central feature of contrastivism that lets it solve puzzles facing non-contrastive theories is that a concept may apply relative to one set of alternatives without applying relative to others. For example, just because we know that you ought to A rather than B, that does not tell us anything about whether you ought to A rather than C. This central feature leads to a very important challenge: sometimes, knowing that a concept applies relative to some alternatives should tell us whether it applies relative to certain other alternatives. For example, if I know that I ought to A rather than either of B or C (out of {A, B, C}), our theory should guarantee that I ought to A rather than B (out of {A, B}). Similarly, if I ought to A rather than B and I ought to B rather than C, then our theory should guarantee that I ought to A rather than C.

The advantages of contrastivism come from letting the application of a concept vary with the alternatives. What this problem shows is that we have to constrain this variation in certain ways. The strategy adopted by contrastivists who have addressed this problem is to appeal to some non-contrastive foundation on which the application of the concept depends. For example, contrastivists about “ought” who have addressed this problem appeal to a contrast-invariant ranking of alternatives, and let the application of “ought” depend on this ranking in ways that deliver the necessary constraints.
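One way to picture this strategy is the following minimal Python sketch. It assumes, purely for illustration, that the contrast-invariant ranking is a numerical score and that "ought to A, out of a contrast class" holds when A outranks every other member of that class. The names RANK and ought, and the scores themselves, are hypothetical and are not drawn from any particular contrastivist theory.

```python
# Hypothetical contrast-invariant ranking: a higher score means a better alternative.
RANK = {"A": 3, "B": 2, "C": 1}

def ought(action, contrast_class):
    """True if `action` outranks every other member of `contrast_class`."""
    if action not in contrast_class:
        return False
    return all(RANK[action] > RANK[other]
               for other in contrast_class if other != action)

# The verdict varies with the contrast class ...
print(ought("B", {"B", "C"}))        # B rather than C
print(ought("B", {"A", "B", "C"}))   # B out of {A, B, C}

# ... but the shared ranking constrains how it varies.
print(ought("A", {"A", "B", "C"}) and ought("A", {"A", "B"}))
```

Because every verdict is read off the same ranking, a verdict relative to {A, B, C} carries over to {A, B}, and "A rather than B" together with "B rather than C" yields "A rather than C".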

4. Conclusion

Contrastivism has been applied across much of philosophy, and it is no wonder why. It promises to resolve the closure paradox in epistemology, provide the best theory of explanation, perhaps the central concept in philosophy and science, and finally give a true theory of causation. And that is before we even broach the field of ethics. There, contrastivism promises to resolve, or at least shed serious light on, the paradoxes of deontic logic and the problem of determinism, and to provide an account of reasons for action. There is much more work to be done in making good on these promises. But at the very least, this appears to be a very fruitful research program, especially in ethics, where less work has been done.

5. References and Further Reading

Baumann, P. 2008. “Problems for Sinnott-Armstrong’s Moral Contrastivism.” The Philosophical Quarterly 58(232): 463-470.
- Argues that contrastivism about knowledge makes bad predictions in cases of “crazy contrast classes”.
Blaauw, M. (ed.) 2012. Contrastivism in Philosophy. Routledge.
- A collection of papers demonstrating the breadth of the contrastivist program in philosophy, including several in ethics.
Cariani, F. 2013. “Ought and Resolution Semantics.” Noûs 47(3): 534-558.
- Develops a sophisticated contrastive semantic theory for “ought”.
Chandler, J. 2007. “Solving the Tacking Problem with Contrast Classes.” British Journal for the Philosophy of Science 58(3): 489-502.
- Uses contrastive confirmation to solve an important problem in confirmation theory.
Chandler, J. 2013. “Contrastive Confirmation: Some Competing Accounts.” Synthese 190(1): 129-138.
Craig, E. 1990. Knowledge and the State of Nature: An Essay in Conceptual Synthesis. Oxford University Press.
- Argues that the central function of the concept of knowledge is to identify good sources of information, and develops a theory of knowledge based on this conception.
Dretske, F. 1970. “Epistemic Operators.” Journal of Philosophy 67: 1007-1023.
- Early version of the relevant alternatives theory of knowledge, direct predecessor of contrastivism.
Driver, J. 2012. “Luck and Fortune in Moral Evaluation.” In Blaauw (ed.), Contrastivism in Philosophy. Routledge, 154-172.
- Sketches a contrastive account of luck, and applies it to the problem of moral luck.
Finlay, S. 2009. “Oughts and Ends.” Philosophical Studies 143(3): 315-340.
Finlay, S. 2014. Confusion of Tongues: A Theory of Normative Language. Oxford University Press.
- Develops a theory of “ought” which makes use of contrastivist machinery in the service of providing a comprehensive theory of normativity.
Finlay, S. and Snedegar, J. 2014. “One Ought Too Many.” Philosophy and Phenomenological Research 89(1): 102-124.
- Defends a uniform, propositional operator semantics for “ought”, making crucial use of contrastivism.
Fitelson, B. 2012. “Contrastive Bayesianism.” In Blaauw (ed.), Contrastivism in Philosophy. Routledge, 64-87.
- Discussion of contrastive theories of confirmation.
van Fraassen, B. 1980. The Scientific Image. Oxford University Press.
- Influential development of a contrastive theory of explanation.
Grice, H. P. 1989. “Logic and Conversation.” In Grice, Studies in the Way of Words. Harvard University Press, 22-40.
- Classic discussion of conversational implicature, where speakers communicate more than they literally say.
Groenendijk, J. and Stokhof, M. 1997. “Questions.” In van Benthem, J. and ter Meulen, A. (eds.), Handbook of Logic and Language. Elsevier Science Publishers, 1055-1124.
- Detailed discussion of the semantics of questions, including the partition/set of alternatives semantics.
Hamblin, C. L. 1958. “Questions.” Australasian Journal of Philosophy 36: 159-168.
- Early development of the partition semantics for questions.
Higginbotham, J. 1996. “The Semantics of Questions.” In Lappin, S. (ed.), The Handbook of Contemporary Semantic Theory. Oxford University Press, 361-383.
Hitchcock, C. 1996a. “The Role of Contrast in Causal and Explanatory Claims.” Synthese 107: 395-419.
Hitchcock, C. 1996b. “Farewell to Binary Causation.” Canadian Journal of Philosophy 26: 267-282.
- Development of a contrastive theory of causation.
Jackson, F. 1985. “On the Semantics and Logic of Obligation.” Mind 94(374): 177-195.
- Development of a contrastive theory of obligation, motivated by puzzles from deontic logic.
Jackson, F. and Pargetter, R. 1986. “Oughts, Options, and Actualism.” Philosophical Review 95(2): 233-255.
- Development of a contrastive theory of obligation.
Jordan, A. 2014. “Whole-Hearted Motivation and Relevant Alternatives: A Problem for the Contrastivist Account of Moral Reasons.” Ethical Theory and Moral Practice 17(5): 835-845.
Karjalainen, A. and Morton, A. 2003. “Contrastive Knowledge.” Philosophical Explorations 6(2): 74-89.
- Argues for a contrastive conception of knowledge.
Lewis, D. 1996. “Elusive Knowledge.” Australasian Journal of Philosophy 74: 549-567.
- Influential development of the relevant alternatives theory of knowledge, a direct predecessor of contrastivism about knowledge.
Lipton, P. 1990. “Contrastive Explanation.” Royal Institute of Philosophy Supplement 27: 247-266.
- Development of a contrastive theory of explanation.
McNamara, P. 2014. “Deontic Logic.” In Zalta (ed.), Stanford Encyclopedia of Philosophy.
- Detailed overview of deontic logic, including the puzzles that motivate contrastivism about obligation.
Morton, A. 2012. “Contrastive Knowledge.” In Blaauw (ed.), Contrastivism in Philosophy. Routledge, 101-115.
- Gives primarily theoretical, rather than linguistic, arguments for contrastivism about knowledge.
Roberts, C. 2012. “Information Structure in Discourse: Towards an Integrated Formal Theory of Pragmatics.” Semantics and Pragmatics 5: 1-69.
- Detailed development of a formal pragmatic theory making crucial use of questions under discussion.
Rooth, M. 1992. “A Theory of Focus Interpretation.” Natural Language Semantics 1: 75-116.
- Develops a theory for interpreting focus (for example, intonational stress) in natural language, making crucial use of sets of alternatives.
Ross, J. 2009. Acceptance and Practical Reason. PhD Thesis, Rutgers University, Chapter 9.
- Gives arguments for a contrastive treatment of normative reasons.
Schaffer, J. 2004. “From Contextualism to Contrastivism.” Philosophical Studies 119(1-2): 73-104.
- Argues that contrastivism about knowledge is superior to standard forms of contextualism.
Schaffer, J. 2005a. “Contrastive Knowledge.” In Gendler and Hawthorne (eds.), Oxford Studies in Epistemology, Vol. 1. Oxford University Press, 235-271.
- Argues for and develops a contrastive theory of knowledge.
Schaffer, J. 2005b. “Contrastive Causation.” The Philosophical Review 114: 327-358.
- Argues for and develops a contrastive theory of causation.
Schaffer, J. 2007a. “Knowing the Answer.” Philosophy and Phenomenological Research 75(2): 383-403.
- Argues for and develops a contrastive theory of knowledge, based primarily on knowledge-wh ascriptions—for example, “knows who”, “knows whether”.
Schaffer, J. 2007b. “Closure, Contrast, and Answer.” Philosophical Studies 133(2): 233-255.
- Shows how a contrastivist about knowledge can explain inferences supported by closure principles, even though the contrastivist has to reject standard closure principles.
Schaffer, J. 2008. “The Contrast-Sensitivity of Knowledge Ascriptions.” Social Epistemology 22(3): 235-245.
- Argues against non-contrastivist treatments of the linguistic data used to motivate contrastivism.
Schaffer, J. 2010. “Contrastive Causation in the Law.” Legal Theory 16: 259-297.
- Applies contrastivism about causation to causation as appealed to in judgments of legal responsibility.
Schaffer, J. 2012. “Causal Contextualisms.” In Blaauw (ed.), Contrastivism in Philosophy. Routledge, 35-63.
- Discussion of contrastivism about causation, with a somewhat pessimistic conclusion for its ultimate prospects.
Sinnott-Armstrong, W. 2004. “Classy Pyrrhonism.” In W. Sinnott-Armstrong (ed.), Pyrrhonian Skepticism. Oxford University Press, 188-207.
- Argues for contrastivism about knowledge, but uses this theory to support Pyrrhonian skepticism about unrelativized knowledge claims by arguing for skepticism about the notion of a “relevant” contrast class.
Sinnott-Armstrong, W. 2006. Moral Skepticisms. Oxford University Press.
- Applies the ideas in Sinnott-Armstrong (2004) to moral epistemology.
Sinnott-Armstrong, W. 2008a. “A Contrastivist Manifesto.” Social Epistemology 22(3): 257-270.
- An overview of contrastivism across philosophy.
Sinnott-Armstrong, W. 2008b. “Replies to Hough, Baumann, and Blaauw.” Philosophical Quarterly 58(232): 478-488.
- Replies to Baumann’s (2008) “crazy contrast class” objection to contrastivism about knowledge.
Sinnott-Armstrong, W. 2012. “Free Contrastivism.” In Blaauw (ed.), Contrastivism in Philosophy. Routledge, 134-153.
- Shows how a contrastive account of freedom can clarify disputes in discussions of determinism and moral responsibility.
Sloman, A. 1970. “Ought and Better.” Mind 79(315): 385-394.
- Early development of a contrastive view of obligation.
Snedegar, J. 2012. “Contrastive Semantics for Deontic Modals.” In Blaauw (ed.), Contrastivism in Philosophy. Routledge, 116-133.
- Argues for a contrastive treatment of deontic modals like “ought”, “must”, and “may”.
Snedegar, J. 2013a. “Negative Reason Existentials.” Thought 2(2): 108-116.
- Shows how to use contrastivism to solve a puzzle about claims like “There’s no reason to cry over spilled milk.”
Snedegar, J. 2013b. “Reason Claims and Contrastivism about Reasons.” Philosophical Studies 166(2): 231-242.
- Argues for contrastivism about normative reasons on the basis of reason claims employing “rather than”.
Snedegar, J. 2014a. “Deontic Reasoning across Contexts.” In Cariani, F. and others (eds.), Deontic Logic and Normative Systems, Vol. 12. Springer Lecture Notes in Computer Science, 208-223.
- Shows how a contrastivist about obligation can recapture intuitive inferences supported by inference rules the contrastivist rejects.
Snedegar, J. 2014b. “Contrastive Reasons and Promotion.” Ethics 125: 39-63.
- Argues for and develops a version of contrastivism, based on the idea that normative reasons are tied to the promotion of objectives.
Yalcin, S. 2011. “Nonfactualism about Epistemic Modality.” In Egan, A. and Weatherson, B. (eds.), Epistemic Modality. Oxford University Press, 295-332.
- Introduces the idea of resolution-sensitivity in a discussion of epistemic modality.

Author Information

Justin Snedegar
Email: js280@st-andrews.ac.uk
University of St Andrews
United Kingdom

Leibniz: Logic

The revolutionary ideas of Gottfried Wilhelm Leibniz (1646-1716) on logic were developed by him between 1670 and 1690. The ideas can be divided into four areas: the Syllogism, the Universal Calculus, Propositional Logic, and Modal Logic.

These revolutionary ideas remained hidden in the Archive of the Royal Library in Hanover until 1903 when the French mathematician Louis Couturat published the Opuscules et fragments inédits de Leibniz. Couturat was a great admirer of Leibniz’s thinking in general, and he saw in Leibniz a brilliant forerunner of modern logic. Nevertheless he came to the conclusion that Leibniz’s logic had largely failed and that in general the so-called “intensional” approach to logic was necessarily bound to fail. Similarly, in their standard historiography of logic, W. & M. Kneale (1962) maintained that Leibniz “never succeeded in producing a calculus which covered even the whole theory of the syllogism”. Even in recent years, scholars like Liske (1994), Swoyer (1995), and Schupp (2000) argued that Leibniz’s intensional conception must give rise to inconsistencies and paradoxes.

On the other hand, starting with Dürr (1930), Rescher (1954), and Kauppi (1960), a certain rehabilitation of Leibniz’s intensional logic may be observed which was gradually supported and supplemented by Poser (1969), Ishiguro (1972), Rescher (1979), Burkhardt (1980), Schupp (1982), and Mugnai (1992). However, the full wealth of Leibniz’s logical ideas became visible only in Lenzen (1990), (2004a), and (2004b), where the many pieces and fragments were joined together into an impressive system of four calculi:

The algebra of concepts L1 (which turns out to be deductively equivalent to the Boolean algebra of sets)
The quantificational system L2 (where “indefinite concepts” function as quantifiers ranging over concepts)
A propositional calculus of strict implication (obtained from L1 by the strict analogy between the containment-relation among concepts and the inference-relation among propositions)
The so-called “Plus-Minus-Calculus” (which is to be viewed as a theory of set-theoretical containment, “addition,” and “subtraction”).

Leibniz’s Logical Works
Works on the Theory of the Syllogism
Works on the Universal Calculus
Leibniz’s Calculus of Strict Implication
Works on Modal Logic
1. Possible-Worlds-Semantics for Alethic Modalities
2. Basic Principles of Deontic Logic
References and Further Reading
1. Abbreviations for Leibniz’s works
2. Secondary Literature

1. Leibniz’s Logical Works

Throughout his life (beginning in 1646 in Leipzig and ending in 1716 in Hanover), Gottfried Wilhelm Leibniz did not publish a single paper on logic, except perhaps for the mathematical dissertation “De Arte Combinatoria” and the juridical disputation “De Conditionibus” (GP 4, 27-104 and AE IV, 1, 97-150; the abbreviations for Leibniz’s works are resolved in section 6). The former work deals with some issues in the theory of the syllogism, while the latter contains investigations of what is nowadays called deontic logic. Leibniz’s main aim in logic, however, was to extend the traditional syllogistic to a “Universal Calculus.” Although there exist several drafts of such a calculus which seem to have been composed for publication, none of them was eventually sent to press. So Leibniz’s logical essays appeared only posthumously. The early editions of his philosophical works, however, contained only a small selection of logical papers. It was not until the beginning of the 20th century that the majority of his logical fragments became generally accessible through the valuable edition of Louis Couturat.

Since only a few manuscripts were dated by Leibniz, his logical oeuvre will be described here not in chronological order but from a purely systematic point of view, by distinguishing four groups:

Works on the Theory of the Syllogism
Works on the Universal Calculus
Works on Propositional Logic
Works on Modal Logic.

2. Works on the Theory of the Syllogism

Leibniz’s innovations within the theory of the syllogism comprise at least three topics:

(a) An “Axiomatization” of the theory of the syllogism, that is, a reduction of the traditional inferences to a small number of basic laws which are sufficient to derive all other syllogisms.

(b) The development of the semantics of so-called “characteristic numbers” for evaluating the logical validity of a syllogistic inference.

(c) The invention of two sorts of graphical devices, that is to say, linear diagrams and (later) so-called “Euler-circles,” as a heuristic for checking the validity of a syllogism.

a. Axiomatization of the Theory of the Syllogism

In the 17th century, logic was still strongly influenced, if not dominated, by syllogistic, that is, by the traditional theory of the four categorical forms:

Universal affirmative proposition (UA): Every S is P (SaP)

Universal negative proposition (UN): No S is P (SeP)

Particular affirmative proposition (PA): Some S is P (SiP)

Particular negative proposition (PN): Some S isn’t P (SoP)

A typical textbook of that time is the famous “Logique de Port Royal” (Arnauld & Nicole (1683)) which, apart from an introductory investigation of ideas, concepts, and propositions in general, basically consists of:

(i) The theory of the so-called “simple” laws of subalternation, opposition, and conversion;

(ii) The theory of the syllogistic “moods” which are classified into four different “figures” for which specific rules hold.

As Leibniz defines it, a “subalternation takes place whenever a particular proposition is inferred from the corresponding universal proposition” (Cout, 80), that is:

SUB 1 SaP → SiP

SUB 2 SeP → SoP.

According to the modern analysis of the categorical forms in terms of first order logic, these laws are not strictly valid but hold only under the assumption that the subject term S is not empty. This problem of “existential import” will be discussed below.

The theory of opposition first has to determine which propositions are contradictories of each other in the sense that they can neither be together true nor be together false. Clearly, the PN is the contradictory, or negation, of the UA, while the PA is the negation of the UN:

OPP 1 ¬SaP ↔ SoP

OPP 2 ¬SeP ↔ SiP.

The next task is to determine which propositions are contraries to each other in the sense that they cannot be together true, while they may well be together false. As Leibniz states: “Theorem 6: The universal affirmative and the universal negative are contrary to each other” (Cout, 82). Finally, two propositions are said to be subcontraries if they cannot be together false while it is possible that they are together true. As Leibniz notes in another theorem, the two particular propositions, SiP and SoP, are logically related to each other in this way. The theory of subalternation and opposition is often summarized in the familiar “Square of Opposition”:

In the paper “De formis syllogismorum Mathematice definiendis” written around 1682 (Cout, 410-416, and the text-critical edition in AE VI, 4, 496-505) Leibniz tackled the task of “axiomatizing” the theory of the syllogistic figures and moods by reducing them to a small number of basic principles. The “Fundamentum syllogisticum”, that is, the axiomatic basis of the theory of the syllogism, is the “Dictum de omni et nullo” (The saying of ‘all’ and ‘none’):

If a total C falls within another total D, or if the total C falls outside D, then whatever is in C, also falls within D (in the former case) or outside D (in the latter case) (Cout, 410-411).

These laws warrant the validity of the following “perfect” moods of the “First Figure”:

BARBARA CaD, BaC → BaD

CELARENT CeD, BaC → BeD

DARII CaD, BiC → BiD

FERIO CeD, BiC → BoD.

If the second premise of the affirmative moods BARBARA and DARII is satisfied, that is, if B is either totally or partially contained in C, then, according to the “Dictum de Omni”, B must also be either totally or partially contained in D since, by the first premise, C is entirely contained in D. Similarly the negative moods CELARENT and FERIO follow from the “Dictum de Nullo”: “B is either totally or partially contained in C; but the entire C falls outside D; hence also B either totally or partially falls outside D” (Cout, 411).

Next Leibniz derives the laws of subalternation from the syllogisms DARII and FERIO by substituting ‘B’ for ‘C’ and ‘C’ for ‘D’, respectively. This derivation (and hence also the validity of the laws of subalternation) tacitly presupposes the following principle which Leibniz considered as an “identity”:

SOME BiB.
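Spelled out, the substitution step runs as follows. This is only a worked restatement of the derivation just described, using DARII, FERIO, and SOME; the bracket notation for substitution is ours.

```latex
\begin{align*}
\text{DARII:}\quad & CaD,\ BiC \rightarrow BiD \\
[B/C,\ C/D]:\quad & BaC,\ BiB \rightarrow BiC \\
\text{by SOME } (BiB):\quad & BaC \rightarrow BiC \qquad (\text{SUB 1}) \\[4pt]
\text{FERIO:}\quad & CeD,\ BiC \rightarrow BoD \\
[B/C,\ C/D]:\quad & BeC,\ BiB \rightarrow BoC \\
\text{by SOME } (BiB):\quad & BeC \rightarrow BoC \qquad (\text{SUB 2})
\end{align*}
```

In both cases the premise BiB can be dropped only because it is taken to be an identity, which is why the derivation depends on SOME.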

With the help of the laws of subalternation, BARBARA and CELARENT may be “weakened” into

BARBARI CaD, BaC → BiD

CELARO CeD, BaC → BoD.

Thus the First Figure altogether has six valid moods, from which one obtains six moods of the Second and six of the Third Figure by means of a logical inference-scheme called “Regressus”:

REGRESS If a conclusion Q logically follows from premises P₁, P₂, but if Q is false, then one of the premises must be false.
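The scheme can be made concrete with a small Python sketch. It assumes a hypothetical encoding of a categorical proposition as a triple (subject, form, predicate) and of a mood as a triple (first premise, second premise, conclusion); the function names are ours. Contradictories are formed according to OPP 1 and OPP 2.

```python
# Contradictory forms according to OPP 1 and OPP 2: a <-> o, e <-> i.
CONTRADICTORY = {"a": "o", "o": "a", "e": "i", "i": "e"}

def negate(prop):
    """Return the contradictory of a proposition (subject, form, predicate)."""
    subject, form, predicate = prop
    return (subject, CONTRADICTORY[form], predicate)

def regress(mood, k):
    """Apply REGRESS to a mood (premise_0, premise_1, conclusion).

    The negated conclusion replaces premise k, and the negation of
    premise k becomes the new conclusion.
    """
    premises = [mood[0], mood[1]]
    conclusion = mood[2]
    new_conclusion = negate(premises[k])
    premises[k] = negate(conclusion)
    return (premises[0], premises[1], new_conclusion)

def show(mood):
    """Render a mood in the notation used in the text."""
    fmt = lambda p: "".join(p)
    return f"{fmt(mood[0])}, {fmt(mood[1])} -> {fmt(mood[2])}"

BARBARA = (("C", "a", "D"), ("B", "a", "C"), ("B", "a", "D"))

print(show(regress(BARBARA, 1)))  # CaD, BoD -> BoC
print(show(regress(BARBARA, 0)))  # BoD, BaC -> CoD
```

In the first result the middle term D is the predicate of both premises (Second Figure); in the second the middle term B is the subject of both premises (Third Figure). Applying the same two transformations to each of the six moods of the First Figure gives the twelve further moods mentioned above.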

When Leibniz carefully carries out these derivations, he presupposes the laws of opposition, OPP 1 and OPP 2. Finally, six valid moods of the Fourth Figure can be derived from corresponding moods of the First Figure with the help of the laws of conversion. According to traditional doctrines, the PA and the UN may be converted “simpliciter”, while the UA can only be converted “per accidens”:

CONV 1 BiD → DiB

CONV 2 BeD → DeB

CONV 3 BaD → DiB.

As Leibniz shows, these laws can in turn be derived from some previously proven syllogisms with the help of the “identical” proposition:

ALL BaB.

Furthermore one easily obtains another law of conversion according to which the UN can also be converted “accidentally”:

CONV 4 BeD → DoB.

The announced derivation of the moods of the Fourth Figure was not carried out in the fragment “De formis syllogismorum Mathematice definiendis” which just breaks off with a reference to “Figura Quarta”. It may, however, be found in the manuscript LH IV, 6, 14, 3 which, unfortunately, was only partially edited in Cout, 204. At any rate, Leibniz managed to prove that all valid moods can be reduced to the “Fundamentum syllogisticum” in conjunction with the laws of opposition, the inference scheme “Regressus”, and the “identical” propositions SOME and ALL.

Now while ALL is an identity or theorem of first order logic, ∀x(Bx → Bx), SOME is nowadays interpreted as ∃x(Bx ∧ Bx). This formula is equivalent to ∃x(Bx), that is, to the assumption that there “exists” at least one x such that x is B. Hence the laws of subalternation presuppose that each concept B (which can occupy the position of the subject of a categorical form) is “non-empty”. Leibniz discussed this problem of “existential import” in a paper entitled “Difficultates quaedam logicae” (GP 7, 211-217) where he distinguished two kinds of “existence”: Actual existence of the individuals inhabiting our real world vs. merely possible subsistence of individuals “in the region of ideas”. According to Leibniz, logical inferences should always be evaluated with reference to “the region of ideas”, that is, the larger set of all possible individuals. Therefore all that is required for the validity of subalternation is that the term B occupying the position of the subject of a categorical form has a non-empty extension within the domain of possible individuals. As will turn out below (compare the definition of an extensional interpretation of L1 in section 3.1), this weak condition of “existential import” becomes tantamount to the assumption that the respective concept B is self-consistent!
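The dependence of subalternation on non-empty subject terms can be checked mechanically. The following Python sketch assumes the usual set-theoretic truth conditions for the four categorical forms over a small finite domain, which stands in for the domain of possible individuals; the three-element domain and all function names are illustrative assumptions.

```python
from itertools import combinations, product

def subsets(domain, allow_empty):
    """All subsets of `domain` as frozensets, optionally excluding the empty set."""
    start = 0 if allow_empty else 1
    return [frozenset(c)
            for r in range(start, len(domain) + 1)
            for c in combinations(domain, r)]

# Extensional truth conditions of the four categorical forms.
def a(S, P): return S <= P          # Every S is P
def e(S, P): return not (S & P)     # No S is P
def i(S, P): return bool(S & P)     # Some S is P
def o(S, P): return bool(S - P)     # Some S is not P

def sub1_holds(allow_empty):
    """Check SUB 1 (SaP -> SiP) for all pairs of extensions over a 3-element domain."""
    terms = subsets(range(3), allow_empty)
    return all(i(S, P) for S, P in product(terms, repeat=2) if a(S, P))

print(sub1_holds(allow_empty=False))  # every term has a non-empty extension
print(sub1_holds(allow_empty=True))   # empty subject terms admitted
```

With non-empty extensions the check succeeds; once the empty set is admitted it fails, because SaP is then vacuously true for an empty S while SiP is false. Likewise, SOME (BiB) comes out true exactly when the extension of B is non-empty.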

b. The Semantics of “Characteristic Numbers”

In a series of papers of April 1679, Leibniz elaborated the idea of assigning natural numbers to the subject and predicate of a proposition α in such a way that the truth of α can be “read off” from these numbers. Apparently Leibniz was hoping that mankind might one day discover the “true” characteristic numbers which would enable one to determine the truth of arbitrary propositions just by mathematical calculations! In the essays of April 1679, however, he pursued only the much more modest goal of defining appropriate arithmetical conditions for determining whether a syllogistic inference is logically valid. This task was guided by the idea that a term composed of concepts A and B gets assigned the product of the numbers assigned to the components:

For example, since ‘man’ is ‘rational animal’, if the number of ‘animal’, a, is 2, and the number of ‘rational’, r, is 3, then the number of ‘man’, m, will be the same as a*r, in this example 2*3 or 6. (LLP, 17).

Now a UA like ‘All gold is metal’ can be understood as maintaining that the concept ‘gold’ contains the concept ‘metal’ (because ‘gold’ can be defined as ‘the heaviest metal’). Therefore it seems obvious to postulate that in general ‘Every S is P’ is true if and only if s, the characteristic number assigned to S, contains p, the number assigned to P, as a prime factor; or, in other words, s must be divisible by p. In a first approach, Leibniz thought that the truth-conditions for the particular proposition ‘Some S are P’ might be construed similarly by requiring that either s can be divided by p or conversely p can be divided by s. But this was mistaken. After some trials and errors, Leibniz found the following more complicated solution:

(i) To every term T, a pair of natural numbers <+t₁;-t₂> is assigned such that t₁ and t₂ are relatively prime, that is, they don’t have a common divisor.

(ii) The UA ‘Every S is P’ is true (relative to the assignment (i)) if and only if +s₁ is divisible by +p₁ and -s₂ is divisible by -p₂.

(iii) The UN ‘No S is P’ is true if and only if +s₁ and -p₂ have a common divisor or +p₁ and -s₂ have a common divisor.

(iv) The PA ‘Some S is P’ is true if and only if condition (iii) is not satisfied.

(v) The PN ‘Some S isn’t P’ is true if and only if condition (ii) is not satisfied.

(vi) An inference from premises P₁, P₂ to the conclusion C is logically valid if and only if for each assignment of numbers satisfying condition (i), C becomes true whenever both P₁ and P₂ are true.

As was shown by Lukasiewicz (1951), this semantics satisfies the simple inferences of opposition, subalternation, and conversion, as well as all (and only) the syllogisms which are commonly regarded as valid. Leibniz tried to generalize this semantics for the entire algebra of concepts, but he never found a way to cope with negative concepts. This problem has only been solved by contemporary logicians; compare Sanchez-Mazas (1979), Sotirov (1999).
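Conditions (i) to (vi) lend themselves to a direct computational sketch. The Python code below implements the truth conditions for pairs of numbers and searches, up to an arbitrary bound, for an assignment that makes both premises true and the conclusion false. The bound, the function names, and the choice of example moods are assumptions made for illustration; a bounded search can refute an inference, but finding no counterexample does not by itself prove validity in the sense of (vi).

```python
from itertools import product
from math import gcd

def assignments(bound):
    """Pairs (t1, t2) of relatively prime naturals up to `bound` (condition i)."""
    return [(t1, t2) for t1 in range(1, bound + 1)
            for t2 in range(1, bound + 1) if gcd(t1, t2) == 1]

def ua(s, p):  # (ii) Every S is P
    return s[0] % p[0] == 0 and s[1] % p[1] == 0

def un(s, p):  # (iii) No S is P
    return gcd(s[0], p[1]) > 1 or gcd(p[0], s[1]) > 1

def pa(s, p):  # (iv) Some S is P
    return not un(s, p)

def pn(s, p):  # (v) Some S isn't P
    return not ua(s, p)

def counterexample(premise1, premise2, conclusion, bound=6):
    """Search assignments to the terms B, C, D (numbers up to `bound`) that make
    both premises true and the conclusion false; return one, or None."""
    pairs = assignments(bound)
    for b, c, d in product(pairs, repeat=3):
        if premise1(b, c, d) and premise2(b, c, d) and not conclusion(b, c, d):
            return b, c, d
    return None

# BARBARA: CaD, BaC -> BaD
barbara = counterexample(lambda b, c, d: ua(c, d),
                         lambda b, c, d: ua(b, c),
                         lambda b, c, d: ua(b, d))

# An invalid inference: BaC, CoD -> DoB
invalid = counterexample(lambda b, c, d: ua(b, c),
                         lambda b, c, d: pn(c, d),
                         lambda b, c, d: pn(d, b))

print(barbara)   # None: no refuting assignment within the bound
print(invalid)   # a refuting assignment (b, c, d)
```

The requirement in (i) that the two numbers of a term be relatively prime is what secures subalternation: if +s₁ is divisible by +p₁ and -s₂ by -p₂, any common divisor of +s₁ and -p₂ would also divide both +s₁ and -s₂.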

c. Linear Diagrams and Euler-circles

In the paper “De Formae Logicae Comprobatione per Linearum ductus” probably written after 1686 (Cout, 292-321), Leibniz elaborated two methods for representing the content of categorical propositions. The UA, for example, ‘Every man is an animal’, can be represented either by two nested circles or by two horizontal lines which symbolize that the extension of B is contained in the extension of C (the subsequent graphics are scans from Cout, 292-295):

In the case of a UN like ‘No man is a stone’, one obtains the following diagrams which symbolize that the extension of B is set-theoretically disjoint from the extension of C:

Similarly, the following circles and lines symbolize that, in the case of a PA like ‘Some men are wise’, the extensions of B and C overlap:

Finally, in the case of a PN like ‘Some men are not ruffians’, the diagrams are meant to symbolize that the extension of B is partially disjoint from the extension of C, that is, that some elements of B are not elements of C:

These diagrams may then be used to check whether a given inference is valid. Thus, for example, the validity of FERIO can be illustrated as follows:

Here the conclusion ‘Some D is not B’ follows from the premises ‘No C is B’ and ‘Some D is C’ because the elements of D which are in C can’t be elements of B. On the other hand, invalid syllogisms as, for example, the mood “AOO” of the Fourth Figure, can be refuted as follows:

As the diagram illustrates, the truth of the premises ‘Every B is C’ and ‘Some C is not D’ is compatible with a situation where the conclusion ‘Some D is not B’ is false, that is, where ‘Every D is B’ is true.

Of course, Leibniz’s diagrams, which were re-discovered in the 18th century by Euler (1768) among others, are not without problems. In particular, the circles for the PA and the PN are somewhat inaccurate because they basically visualize one and the same state of affairs, namely that (i) some B are C, and (ii) some B are not C, and also (iii) some C are not B. The need to distinguish between different situations such as ((i) & (ii)) in contrast to ((i) & not (ii)) led to improvements of the method of “Euler-circles” as suggested by Venn (1881), Hamilton (1861), and others. Note, incidentally, that, in the GI, Leibniz himself improved the linear diagrams for the UA, PA and PN by drawing perpendicular lines symbolizing the “maximum”, that is, “the limits beyond which the terms cannot, and within which they can, be extended”. At the same time he used a double horizontal line to symbolize “the minimum, that is, that which cannot be taken away without affecting the relation of the terms” (LLP, 73-4, fn. 2).

3. Works on the Universal Calculus

In the period between, roughly, 1679 and 1690, Leibniz devoted much effort to generalizing the traditional logic into a “Universal Calculus”. At least three different calculi may be distinguished:

(a) The algebra of concepts which is provably equivalent to the Boolean algebra of sets;

(b) A fragmentary quantificational system in which the quantifiers range over concepts but in which quantification over individuals may be introduced by definition;

(c) The so-called “Plus-Minus-calculus” which constitutes an abstract system of “real addition” and “subtraction”. When this calculus is applied to concepts, it yields a weaker logic than the full algebra (a).

a. The Algebra of Concepts L1

The algebra of concepts grows out of the syllogistic framework by three achievements. First, Leibniz drops the informal quantifier expression ‘every’ and formulates the UA simply as “A is B” or, equivalently, as “A contains B”. This fundamental proposition shall here be symbolized as A∈B while its negation will be abbreviated as A∉B. Second, Leibniz introduces an operator of conceptual conjunction which combines two concepts A and B into AB (sometimes also written as “A+B”). Third, Leibniz allows the unrestricted use of conceptual negation which shall here be symbolized as ~A (“Not-A”). Hence, in particular, one can form the inconsistent concept A~A (“A Not-A”) and its tautological counterpart ~(A~A).

Identity or coincidence of concepts might be defined as mutual containment:

DEF 1 (A = B) =_df (A∈B) ∧ (B∈A).

Alternatively, the algebra of concepts can be built up with ‘=’ as a primitive operator while ‘∈’ is defined by:

DEF 2 (A∈B) =_df (A = AB).

Another important operator may be introduced by definition. Concept B is possible if B does not contain a contradiction like A~A:

DEF 3 P(B) =_df (B∉A~A).

Leibniz uses many different locutions to express the self-consistency of a concept A. Instead of ‘A est possibile’ he often says ‘A est res’, ‘A est ens’; or simply ‘A est’. In the opposite case of an impossible concept he also calls A a “false term” (“terminus falsus”).

Identity can be axiomatized by the law of reflexivity in conjunction with the rule of substitutivity:

IDEN 1 A = A

IDEN 2 If A = B, then α[A] ↔ α[B].

By means of these principles, one easily derives the following corollaries:

IDEN 3 A = B → B = A

IDEN 4 A = B ∧ B = C → A = C

IDEN 5 A = B → ~A = ~B

IDEN 6 A = B → AC = BC.

The following laws express the reflexivity and the transitivity of the containment relation:

CONT 1 A∈A

CONT 2 A∈B ∧ B∈C → A∈C.

The most fundamental principle for the operator of conceptual conjunction says: “That A contains B and A contains C is the same as that A contains BC” (LLP, 58, fn. 4), that is,

CONJ 1 A∈BC ↔ A∈B ∧ A∈C.

Conjunction then satisfies the following laws:

CONJ 2 AA = A

CONJ 3 AB = BA

CONJ 4 AB∈A

CONJ 5 AB∈B.

The next operator is conceptual negation, ‘not’. Leibniz had serious problems with finding the proper laws governing this operator. From the tradition, he knew little more than the “law of double negation”:

NEG 1 ~~A = A

One important step towards a complete theory of conceptual negation was to transform the informal principle of contraposition, ‘Every A is B, therefore Every Not-B is Not-A’ into the following principle:

NEG 2 A∈B ↔ ~B∈~A.

Furthermore Leibniz discovered various variants of the “law of consistency”:

NEG 3 A ≠ ~A

NEG 4 A = B → A ≠ ~B.

NEG 5* A∉~A

NEG 6* A∈B → A∉~B.

In the GI, these principles are formulated as follows: “A proposition false in itself is ‘A coincides with Not-A’” (§ 11); “If A = B, then A ≠ Not-B” (§ 171); “It is false that B contains Not-B, that is, B doesn’t contain Not-B” (§ 43); and “A is B, therefore A isn’t Not-B” (§ 91).

Principles NEG 5* and NEG 6* have been marked with a ‘*’ in order to indicate that the laws as stated by Leibniz are not absolutely valid but have to be restricted to self-consistent terms:

NEG 5 P(A) → A∉~A

NEG 6 P(A) → (A∈B → A∉~B).

The following two laws describe some characteristic relations between the possibility-operator P and the other operators of L1:

POSS 1 A∈B ∧ P(A) → P(B)

POSS 2 A∈B ↔ ¬P(A~B).

All these principles were discovered by Leibniz himself, who thus provided an almost complete axiomatization of L1. As a matter of fact, the “intensional” algebra of concepts can be proven to be equivalent to Boole’s extensional algebra of sets provided that one adds the following counterpart of the “ex contradictorio quodlibet”:

NEG 7 (A~A)∈B.

As regards the relation of conceptual containment, A∈B, it is important to observe that Leibniz’s standard formulation ‘A contains B’ expresses the so-called “intensional” view of concepts as ideas, while we here want to develop an extensional interpretation in terms of the sets of individuals that fall under the concepts. Leibniz explained the mutual relationship between the “intensional” and the extensional point of view in the following passage from the “New Essays on Human Understanding”:

The common manner of statement concerns individuals, whereas Aristotle’s refers rather to ideas or universals. For when I say Every man is an animal I mean that all the men are included among all the animals; but at the same time I mean that the idea of animal is included in the idea of man. ‘Animal’ comprises more individuals than ‘man’ does, but ‘man’ comprises more ideas or more attributes: one has more instances, the other more degrees of reality; one has the greater extension, the other the greater intension. (NE, Book IV, ch. XVII, § 8; compare the original French version in GP 5, 469).

If ‘Int(A)’ and ‘Ext(A)’ abbreviate the “intension” and the extension of a concept A, respectively, then the so-called law of reciprocity can be formalized as follows:

RECI Int(A) ⊆ Int(B) ↔ Ext(A) ⊇ Ext(B).

From this it immediately follows that two concepts A, B have the same “intension” iff they have the same extension. This somewhat surprising result might seem to unveil an inadequacy of Leibniz’s conception. However, “intensionality” in the sense of traditional logic must not be mixed up with intensionality in the modern sense. Furthermore, in Leibniz’s view, the extension of a concept A is not just the set of actually existing individuals, but rather the set of all possible individuals that fall under concept A. Therefore one may define the concept of an extensional interpretation of L1 in accordance with Leibniz’s ideas as follows:

DEF 4 Let U be a non-empty set (the domain of all possible individuals), and let ϕ be a function such that ϕ(A) ⊆ U for each concept-letter A. Then ϕ is an extensional interpretation of L1 if and only if:

(1) ϕ(A∈B) = true iff ϕ(A) ⊆ ϕ(B);

(2) ϕ(A=B) = true iff ϕ(A) = ϕ(B);

(3) ϕ(AB) = ϕ(A) ∩ ϕ(B);

(4) ϕ(~A) = complement of ϕ(A);

(5) ϕ(P(A)) = true iff ϕ(A) ≠ ∅.

Conditions (1) and (2) are straightforward consequences of RECI. Condition (3) also is trivial since it expresses that an individual x belongs to the extension of AB just in case that x belongs to the extension of both concepts (and hence to their intersection). According to condition (4), the extension of the negative concept ~A is just the set of all individuals which do not fall under the concept A. Condition (5) says that a concept A is possible if and only if it has a non-empty extension.

At first sight, this requirement appears inadequate, since there are certain concepts – such as that of a unicorn – which happen to be empty but which may nevertheless be regarded as possible, that is, not involving a contradiction. However, the universe of discourse underlying the extensional interpretation of L1 does not consist of actually existing objects only, but instead comprises all possible individuals. Therefore the non-emptiness of the extension of A is both necessary and sufficient for guaranteeing the self-consistency of A. Clearly, if A is possible, then there must be at least one possible individual x that falls under concept A.
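The extensional interpretation of DEF 4 can be tried out mechanically on a small finite domain. The following Python sketch is only an illustration under stated assumptions: the three-element universe U and the helper names (contains, conj, neg, possible, valid) are hypothetical choices made for this example, not part of Leibniz's own apparatus. It checks some of the laws stated above by brute force over all concepts, and it shows that NEG 5* fails for the empty (impossible) concept while the restricted version NEG 5 holds.

```python
from itertools import combinations, product

# Hypothetical toy domain of "possible individuals" (an assumption for illustration).
U = frozenset({0, 1, 2})

# Every subset of U serves as the extension of some concept.
CONCEPTS = [frozenset(c) for r in range(len(U) + 1)
            for c in combinations(sorted(U), r)]

def contains(a, b):
    """A∈B ('A contains B'), condition (1): Ext(A) is a subset of Ext(B)."""
    return a <= b

def conj(a, b):
    """AB, condition (3): intersection of the extensions."""
    return a & b

def neg(a):
    """~A, condition (4): complement relative to U."""
    return U - a

def possible(a):
    """P(A), condition (5): the extension is non-empty."""
    return len(a) > 0

def valid(law, arity):
    """True if the law holds under every assignment of concepts."""
    return all(law(*args) for args in product(CONCEPTS, repeat=arity))

results = {
    "CONJ 1": valid(lambda a, b, c: contains(a, conj(b, c))
                    == (contains(a, b) and contains(a, c)), 3),
    "NEG 2": valid(lambda a, b: contains(a, b) == contains(neg(b), neg(a)), 2),
    "NEG 5*": valid(lambda a: not contains(a, neg(a)), 1),
    "NEG 5": valid(lambda a: (not possible(a)) or (not contains(a, neg(a))), 1),
    "NEG 7": valid(lambda a, b: contains(conj(a, neg(a)), b), 2),
    "POSS 1": valid(lambda a, b: (not (contains(a, b) and possible(a)))
                    or possible(b), 2),
    "POSS 2": valid(lambda a, b: contains(a, b)
                    == (not possible(conj(a, neg(b)))), 2),
}

for name, ok in results.items():
    print(name, "holds" if ok else "fails")
```

A finite check of this kind is of course no proof of the laws in general; it merely confirms that they agree with the set-theoretical reading on one small model.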

It has often been noted that Leibniz’s logic of concepts lacks the operator of disjunction. Although this is by and large correct, it doesn’t imply any defect or any incompleteness of the system L1 because the operator A∨B may simply be introduced by definition:

DISJ 1 A∨B =_df ~(~A ~B).

Against the background of the above axioms of negation and conjunction, the standard laws for disjunction, for example

DISJ 2 A∈(A∨B)

DISJ 3 B∈(A∨B)

DISJ 4 A∈C ∧ B∈C → (A∨B)∈C,

then become provable (Lenzen (1984)).
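Under the extensional reading, the defined operator of DISJ 1 should coincide with set-theoretical union. The short Python sketch below checks this, together with DISJ 2 to DISJ 4, on a hypothetical three-element domain; the names U, CONCEPTS, neg and disj are assumptions introduced only for this illustration.

```python
from itertools import combinations, product

U = frozenset({0, 1, 2})  # hypothetical toy domain of possible individuals
CONCEPTS = [frozenset(c) for r in range(len(U) + 1)
            for c in combinations(sorted(U), r)]

def neg(a):
    """~A: complement relative to U."""
    return U - a

def disj(a, b):
    """DISJ 1: A∨B defined as ~(~A ~B)."""
    return neg(neg(a) & neg(b))

pairs = list(product(CONCEPTS, repeat=2))
triples = list(product(CONCEPTS, repeat=3))

# The defined disjunction is just the union of the extensions (De Morgan).
disj_is_union = all(disj(a, b) == (a | b) for a, b in pairs)
# DISJ 2 and DISJ 3: each disjunct "contains" the disjunction (subset reading).
disj_2 = all(a <= disj(a, b) for a, b in pairs)
disj_3 = all(b <= disj(a, b) for a, b in pairs)
# DISJ 4: if A∈C and B∈C, then (A∨B)∈C.
disj_4 = all(disj(a, b) <= c for a, b, c in triples if a <= c and b <= c)

print(disj_is_union, disj_2, disj_3, disj_4)
```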

b. The Quantificational System L2

Leibniz’s quantifier logic L2 emerges from L1 by the introduction of so-called “indefinite concepts”. These concepts are symbolized by letters from the end of the alphabet X, Y, Z …, and they function as quantifiers ranging over concepts. Thus, in the GI, Leibniz explains:

(16) An affirmative proposition is ‘A is B’ or ‘A contains B’ […]. That is, if we substitute the value for A, one obtains ‘A coincides with BY’. For example, ‘Man is an animal’, that is, ‘Man’ is the same as ‘a … animal’ (namely, ‘Man’ is ‘rational animal’). For by the sign ‘Y’ I mean something undetermined, so that ‘BY’ is the same as ‘Some B’, or ‘A … animal’ […], or ‘A certain animal’. So ‘A is B’ is the same as ‘A coincides with some B’, that is, ‘A = BY’.

With the help of the modern symbol for the existential quantifier, the latter law can be expressed more precisely as follows:

CONT 3 A∈B ↔ ∃Y(A = BY).

As Leibniz himself noted, the formalization of the UA according to CONT 3 is provably equivalent to the simpler representation according to DEF 2:

It is noteworthy that for ‘A = BY’ one can also say ‘A = AB’ so that there is no need to introduce a new letter. (Cout, 366; compare also LLP, 56, fn. 1.)

On the one hand, according to the rule of existential generalization,

EXIST 1 If α[A], then ∃Yα[Y],

A = AB immediately entails ∃Y(A = YB). On the other hand, if there exists some Y such that A = YB, then according to IDEN 6, AB = YBB, that is (by CONJ 2), AB = YB and hence (by the premise A = YB) AB = A. (This proof incidentally was given by Leibniz himself in the important paper “Primaria Calculi Logici Fundamenta” of August 1690; Cout, 235).
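The equivalence of the two formalizations of the UA can also be confirmed by brute force in the extensional reading. In the following Python sketch, which assumes a hypothetical three-element domain and illustrative helper names (def2, cont3), the existential quantifier of CONT 3 is read as "for some subset Y of the domain".

```python
from itertools import combinations

U = frozenset({0, 1, 2})  # hypothetical toy domain of possible individuals
CONCEPTS = [frozenset(c) for r in range(len(U) + 1)
            for c in combinations(sorted(U), r)]

def def2(a, b):
    """DEF 2: A∈B defined as A = AB."""
    return a == (a & b)

def cont3(a, b):
    """CONT 3: A∈B iff there is some Y with A = BY."""
    return any(a == (b & y) for y in CONCEPTS)

# Both formalizations agree on every pair of concepts.
agree = all(def2(a, b) == cont3(a, b) for a in CONCEPTS for b in CONCEPTS)

# Whenever A = AB holds, A itself is a witness for Y (existential generalization).
witness_is_a = all(a == (b & a)
                   for a in CONCEPTS for b in CONCEPTS if def2(a, b))

print(agree, witness_is_a)
```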

Next observe that Leibniz often used to formalize the PA ‘Some A is B’ by means of the indefinite concept Y as ‘YA∈B’. In view of CONT 3, this representation might be transformed into the (elliptic) equation YA = ZB. However, both formalizations are somewhat inadequate because they are easily seen to be theorems of L2! According to CONJ 4, BA contains B, hence by EXIST 1:

CONJ 6 ∃Y(YA∈B).

Similarly, since, according to CONJ 3, AB = BA, a twofold application of EXIST 1 yields:

CONJ 7 ∃Y∃Z(YA = BZ).

These tautologies, of course, cannot adequately represent the PA which for an appropriate choice of concepts A and B may become false! In order to resolve these difficulties, consider a draft of a calculus probably written between 1686 and 1690 (compare Cout, 259-261, and the text-critical edition in AE VI, 4, # 171), where Leibniz proved the principle:

NEG 8* A∉B ↔ ∃Y(YA∈~B).

On the one hand, it is interesting to see that after first formulating the right hand side of the equivalence, “as usual”, in the elliptic way ‘YA is Not-B’, Leibniz later paraphrased it by means of the explicit quantifier expression “there exists a Y such that YA is Not-B”. On the other hand, Leibniz discovered that NEG 8* has to be improved by requiring more exactly that there exists a Y such that YA contains ~B and YA is possible, that is, Y must be compatible with A:

NEG 8 A∉B ↔ ∃Y(P(YA) ∧ YA∈~B).

Leibniz’s proof of this important law is quite remarkable:

(18) […] to say ‘A isn’t B’ is the same as to say ‘there exists a Y such that YA is Not-B’. If ‘A is B’ is false, then ‘A Not-B’ is possible by [POSS 2]. ‘Not-B’ shall be called ‘Y’. Hence YA is possible. Hence YA is Not-B. Therefore we have shown that, if it is false that A is B, then QA is Not-B. Conversely, let us show that if QA is Not-B, ‘A is B’ is false. For if ‘A is B’ would be true, ‘B’ could be substituted for ‘A’ and we would obtain ‘QB is Not-B’ which is absurd. (Cout, 261)
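Why the unrestricted version NEG 8* fails while NEG 8 holds can be seen concretely in the extensional reading: without the possibility requirement, the impossible concept always serves as a witness for Y. The Python sketch below, based on a hypothetical three-element domain and illustrative function names, collects the counterexamples to NEG 8* and checks NEG 8 by brute force.

```python
from itertools import combinations, product

U = frozenset({0, 1, 2})  # hypothetical toy domain of possible individuals
CONCEPTS = [frozenset(c) for r in range(len(U) + 1)
            for c in combinations(sorted(U), r)]

def neg8_star(a, b):
    """NEG 8*: A∉B iff for some Y, YA∈~B (no possibility requirement)."""
    return (not a <= b) == any((y & a) <= (U - b) for y in CONCEPTS)

def neg8(a, b):
    """NEG 8: A∉B iff for some Y, P(YA) and YA∈~B."""
    return (not a <= b) == any(len(y & a) > 0 and (y & a) <= (U - b)
                               for y in CONCEPTS)

pairs = list(product(CONCEPTS, repeat=2))

# Pairs (A, B) for which the unrestricted principle gives the wrong verdict.
counterexamples = [(a, b) for a, b in pairs if not neg8_star(a, b)]
neg8_valid = all(neg8(a, b) for a, b in pairs)

# NEG 8* goes wrong exactly where 'A is B' is true: the empty Y·A is trivially Not-B.
fails_exactly_when_a_is_b = all(a <= b for a, b in counterexamples)

print(len(counterexamples) > 0, fails_exactly_when_a_is_b, neg8_valid)
```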

To conclude the sketch of L2, let us consider some of the rare passages where an indefinite concept functions as a universal quantifier. In the above quoted draft (Cout, 260), Leibniz put forward the principle “(15) ‘A is B’ is the same as ‘If L is A, it follows that L is B’”:

CONT 4 A∈B ↔ ∀Y(Y∈A → Y∈B).

Furthermore, in § 32 GI, Leibniz at least vaguely recognized that just as A∈B (according to CONT 3) is equivalent to ∃Y(A = YB), so the negation A∉B means that, for any indefinite concept Y, A ≠ BY:

CONT 5 A∉B ↔ ∀Y(A ≠ YB).

According to AE, VI, 4, 753, Leibniz had written: “(32) Propositio Negativa. A non continet B, seu A esse (continere) B falsum est, seu A non coincidit BY”. Unfortunately, the last passage ‘seu A non coincidit BY’ had been overlooked by Couturat and it is therefore also missing in Parkinson’s translation in LLP! Anyway, with the help of ‘∀’, one can formalize Leibniz’s conception of individual concepts as maximally-consistent concepts as follows:

IND 1 Ind(A) ↔_df P(A) ∧ ∀Y(P(AY) → A∈Y).

Thus A is an individual concept iff A is self-consistent and A contains every concept Y which is compatible with A. The underlying idea of the completeness of individual concepts had been formulated in § 72 GI as follows:

So if BY is [“being”], and the indefinite term Y is superfluous, that is, in the way that ‘a certain Alexander the Great’ and ‘Alexander the Great’ are the same, then B is an individual. If the term BA is [“being”] and if B is an individual, then A will be superfluous; or if BA=C, then B=C (LLP 65, § 72 + fn. 1; for a closer interpretation of this idea, see Lenzen (2004c)).

Note, incidentally, that IND 1 might be simplified by requiring that, for each concept Y, A either contains Y or contains ~Y:

IND 2 Ind(A) ↔ ∀Y(A∈~Y ↔ A∉Y).

As a corollary it follows that the invalid principle

NEG 9* A∉B → A∈~B,

which Leibniz again and again had considered as valid, in fact holds only for individual concepts:

NEG 9 Ind(A) → (A∉B → A∈~B).

Already in the “Calculi Universalis Investigationes” of 1679, Leibniz had pointed out:

…If two propositions are given with exactly the same singular [!] subject, where the predicate of the one is contradictory to the predicate of the other, then necessarily one proposition is true and the other is false. But I say: exactly the same [singular] subject, for example, ‘This gold is a metal’, ‘This gold is a not-metal.’ (AE VI, 4, 217-218).

The crucial issue here is that NEG 9* holds only for an individual concept like, for example, ‘Apostle Peter’, but not for general concepts as, for example, ‘man’. The text-critical apparatus of AE reveals that Leibniz was somewhat diffident about this decisive point. He began to illustrate the above rule by the correct example “if I say ‘Apostle Peter was a Roman bishop’, and ‘Apostle Peter was not a Roman bishop’” and then went on, erroneously, to generalize this law for arbitrary terms: “or if I say ‘Every man is learned’ ‘Every man is not learned’.” Finally he noticed this error “Here it becomes evident that I am mistaken, for this rule is not valid.” The long story of Leibniz’s cardinal mistake of mixing up ‘A isn’t B’ and ‘A is not-B’ is analyzed in detail in Lenzen (1986).
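In the extensional reading, the maximally-consistent concepts of IND 1 turn out to be exactly the concepts whose extension is a single possible individual. The following Python sketch, again on a hypothetical three-element domain with illustrative function names, checks that IND 1 and IND 2 pick out the same concepts and that NEG 9* fails for a general concept although NEG 9 holds.

```python
from itertools import combinations

U = frozenset({0, 1, 2})  # hypothetical toy domain of possible individuals
CONCEPTS = [frozenset(c) for r in range(len(U) + 1)
            for c in combinations(sorted(U), r)]

def ind1(a):
    """IND 1: A is possible and contains every Y that is compatible with A."""
    return len(a) > 0 and all(a <= y for y in CONCEPTS if len(a & y) > 0)

def ind2(a):
    """IND 2: for every Y, A contains ~Y exactly when A does not contain Y."""
    return all((a <= (U - y)) == (not a <= y) for y in CONCEPTS)

def neg9_star(a):
    """NEG 9* for a fixed A: for every B, A∉B implies A∈~B."""
    return all(a <= (U - b) for b in CONCEPTS if not a <= b)

individuals = [a for a in CONCEPTS if ind1(a)]
same_concepts = all(ind1(a) == ind2(a) for a in CONCEPTS)
neg9_holds = all(neg9_star(a) for a in individuals)

# A general concept with two individuals in its extension violates NEG 9*.
general = frozenset({0, 1})
print([sorted(a) for a in individuals], same_concepts, neg9_holds, neg9_star(general))
```

In this toy model the impossible concept satisfies NEG 9* only vacuously, since it "contains" every concept; it is excluded from the individual concepts by the requirement P(A).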

There are many different ways to represent the categorical forms by formulas of L1 or L2. The most straightforward formalization would be the following “homogeneous” schema in terms of conceptual containment:

UA A∈B UN A∈~B

PA A∉~B PN A∉B.

The “homogeneity” consists in two facts:

(a) The formula for the UN is obtained from that of the UA by replacing the predicate B with its negation, ~B. This is the formal counterpart of the traditional principle of obversion according to which, for example, ‘No A is B’ is equivalent to ‘Every A is not-B’.

(b) In accordance with the traditional laws of opposition, the formulas for the particular propositions are just taken as the negations of corresponding universal propositions.

In view of DEF 2, the first schema may be transformed into

UA A = AB UN A = A~B

PA A ≠ A~B PN A ≠ AB.

Similarly, by means of the fundamental law POSS 2, one obtains

UA ¬P(A~B) UN ¬P(AB)

PA P(AB) PN P(A~B).

Furthermore, with the help of indefinite concepts, one can formulate, for example,

UA ∃Y(A = YB) UN ∃Y(A = Y~B)

PA ∀Y(A ≠ Y~B) PN ∀Y(A ≠ YB).
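That the four schemata just listed say the same thing can be checked mechanically in the extensional reading. The Python sketch below is a minimal illustration on a hypothetical three-element domain; the function name forms and the example concepts man and animal are assumptions made for this example. Each schema is evaluated as a tuple of truth values in the order (UA, UN, PA, PN).

```python
from itertools import combinations, product

U = frozenset({0, 1, 2})  # hypothetical toy domain of possible individuals
CONCEPTS = [frozenset(c) for r in range(len(U) + 1)
            for c in combinations(sorted(U), r)]

def forms(a, b):
    """Truth values of (UA, UN, PA, PN) under each of the four schemata."""
    nb = U - b  # extension of ~B
    return {
        "containment": (a <= b, a <= nb, not a <= nb, not a <= b),
        "identity": (a == (a & b), a == (a & nb), a != (a & nb), a != (a & b)),
        "possibility": (not (a & nb), not (a & b), bool(a & b), bool(a & nb)),
        "indefinite": (
            any(a == (y & b) for y in CONCEPTS),
            any(a == (y & nb) for y in CONCEPTS),
            all(a != (y & nb) for y in CONCEPTS),
            all(a != (y & b) for y in CONCEPTS),
        ),
    }

# All four schemata agree for every choice of A and B.
schemata_agree = all(len(set(forms(a, b).values())) == 1
                     for a, b in product(CONCEPTS, repeat=2))

# Illustrative example: every man is an animal, some man is an animal.
man, animal = frozenset({0}), frozenset({0, 1})
print(schemata_agree, forms(man, animal)["containment"])
```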

Leibniz used to work with various elements of these representations, often combining them into complicated inhomogeneous schemata such as:

“A = YB is the UA, where the adjunct Y is like an additional unknown term: ‘Every man’ is the same as ‘A certain animal’.

YA = ZB is the PA. ‘Some man’ or ‘Man of a certain kind’ is the same as ‘A certain learned’.

A = Y not-B [is the UN] No man is a stone, that is, Every man is a not-stone, that is, ‘Man’ and ‘A certain not-stone’ coincide.

YA = Z not-B [is the PN] A certain man isn’t learned or is not-learned, that is, ‘A certain man’ and ‘A certain not-learned’ coincide” (Cout, 233-234).

But the representations of PA and PN of this schema are inadequate because the formulas ‘[∃Y∃Z](YA = ZB)’ and ‘[∃Y∃Z](YA = Z~B)’ are theorems of L2! These conditions may, however, easily be corrected by adding the requirement that YA is self-consistent:

UA ∃Y(A = YB) UN ∃Y(A = Y~B)

PA ∃Y∃Z(P(YA) ∧ YA = ZB) PN ∃Y∃Z(P(YA) ∧ YA = Z~B).

Already in the paper “De Formae Logicae Comprobatione per Linearum ductus”, Leibniz had made numerous attempts to prove the basic laws of syllogistic with the help of these schemata. He continued these efforts in two interesting fragments of August 1690 dealing with “The Primary Bases of a Logical Calculus” (LLP, 90-92 + 93-94; compare also the closely related essays “Principia Calculi rationalis” in Cout, 229-231 and the untitled fragments Cout, 259-261 + 261-264). In the end, however, Leibniz remained unsatisfied with his attempts.

To be sure, a complete proof of the theory of the syllogism could easily be obtained by drawing upon the full list of “axioms” for L1 and L2 as stated above. But Leibniz more ambitiously tried to find proofs which presuppose only a small number of “self-evident” laws for identity. In particular, he was not willing to adopt the principle

(17) Not-B = not-B not-(AB), that is, Not-B contains Not-AB, or Not-B is not-AB

as a fundamental axiom which therefore need not itself be demonstrated. Although Leibniz realized that (17) is equivalent to the law of contraposition repeated in the subsequent §

(19) ‘A = AB’ and ‘Not-B = Not-B Not-A’ are equivalent. This is conversion by contraposition (Cout, 422),

he still thought it necessary to prove this “axiom”: “This remains to be demonstrated in our calculus”!

c. The Plus-Minus-Calculus

The so-called Plus-Minus-Calculus was mainly developed in the paper “Non inelegans specimen demonstrandi in abstractis” of around 1686/7 (compare GP 7, ## XIX, XX and the text-critical edition in AE VI, 4, ## 177, 178; English translations are provided in LLP, 122-130 + 131-144). Strictly speaking, the Plus-Minus-Calculus is not a logical calculus but rather a much more general calculus which admits of different applications and interpretations. In its abstract form, it should be regarded as a theory of set-theoretical containment, set-theoretical “addition”, and set-theoretical “subtraction”. Unlike modern systems of set-theory, however, Leibniz’s calculus has no counterpart of the relation ‘x is an element of A’; and it also lacks the operator of set-theoretical “negation”, that is, set-theoretical complement! The complement of set A might, though, be defined with the help of the subtraction operator as (U-A) where the constant ‘U’ designates the universe of discourse. But, in Leibniz’s calculus, this additional logical element is lacking.

Leibniz’s drafts exhibit certain inconsistencies which result from the experimental character of developing the laws for “real” addition and subtraction in close analogy to the laws of arithmetical addition and subtraction. The genesis of this idea is described in detail in Lenzen (1989). The inconsistencies might be removed basically in two ways. First, one might restrict A-B to the case where B is contained in A; such a conservative reconstruction of the Plus-Minus-Calculus has been developed in Dürr (1930). The second, more rewarding alternative consists in admitting the operation of “real subtraction” A-B also if B is not contained in A. In any case, however, one has to give up Leibniz’s idea that subtraction might yield “privative” entities which are “less than nothing”.

In the following reconstruction, Leibniz’s symbols ‘+’ for the addition (that is, union) and ‘-’ for the subtraction of sets are adopted, while his informal expressions ‘Nothing’ (“nihil”) and ‘is in’ (“est in”) are replaced by the modern symbols ‘∅’ and ‘⊆’. Set-theoretical identity may be treated either as a primitive or as a defined operator. In the former case, inclusion can be defined either by A⊆B =_df ∃Y(A+Y = B) or, more simply, by A⊆B =_df (A+B = B). If, conversely, inclusion is taken as primitive, identity can be defined as mutual inclusion: A=B =_df (A⊆B) ∧ (B⊆A) (see, for example, Definition 3, Propositions 13 + 14 and Proposition 17 in LLP, 131-144).

Set-theoretical addition is symmetric, or, as Leibniz puts it, “transposition makes no difference here” (LLP, 132):

PLUS 1 A+B = B+A.

The main difference between arithmetical addition and “real addition” is that the addition of one and the same “real” thing (or set of things) doesn’t yield anything new:

PLUS 2 A+A = A.

As Leibniz puts it (LLP, 132): “A+A = A […] that is, repetition changes nothing. (For although four coins and another four coins are eight coins, four coins and the same four already counted are not)”.

The “real nothing”, that is, the empty set ∅, is characterized as follows: “It does not matter whether Nothing is put or not, that is, A+Nih. = A” (Cout, 267):

NIHIL 1 A+∅ = A.

In view of the relation (A⊆B) ↔ (A+B = B), this law can be transformed into:

NIHIL 2 ∅⊆A.

“Real” subtraction may be regarded as the converse operation of addition: “If the same is put and taken away […] it coincides with Nothing. That is, A […] – A […] = N” (LLP, 124, Axiom 2):

MINUS 1 A-A = ∅.

Leibniz also considered the following principles which in a stronger form express that subtraction is the converse of addition:

MINUS 2* (A+B)-B = A

MINUS 3* (A+B) = C → C-B = A.

But he soon recognized that these laws do not hold in general but only in the special case where the sets A and B are “uncommunicating” (Cout, 267, # 29: “Therefore if A+B = C, then A = C-B […] but it is necessary that A and B have nothing in common”.) The new operator of “communicating” sets has to be understood as follows:

If some term, M, is in A, and the same term is in B, this term is said to be ‘common’ to them, and they will be said to be ‘communicating’. (LLP, 123, Definition 4)

Hence two sets A and B have something in common if and only if there exists some set Y such that Y⊆A and Y⊆B. Now since, trivially, the empty set is included in every set A (NIHIL 2), one has to add the qualification that Y is not empty:

COMMON 1 Com(A,B) ↔_df ∃Y(Y≠∅ ∧ Y⊆A ∧ Y⊆B).

The necessary restriction of MINUS 2* and MINUS 3* can then be formalized as follows:

MINUS 2 ¬Com(A,B) → ((A+B)-B = A)

MINUS 3 ¬Com(A,B) ∧ (A+B = C) → (C-B = A).

Similarly, Leibniz recognized (LLP, 130) that from an equation A+B = A+C, A may be subtracted on both sides provided that A is “uncommunicating” both with B and with C, that is,

MINUS 4 ¬Com(A,B) ∧ ¬Com(A,C) → (A+B = A+C → B=C).

Furthermore Leibniz discovered that the implication in MINUS 2 may be converted (and hence strengthened into a biconditional). Thus one obtains the following criterion: Two sets A, B are “uncommunicating” if and only if the result of first adding and then subtracting B coincides with A. Inserting negations on both sides of this equivalence one obtains:

COMMON 2 Com(A,B) ↔ ((A+B)-B) ≠ A.

Whenever two sets A, B are communicating or “have something in common”, the intersection of A and B, in modern symbols A∩B, is not empty (LLP, 127, Case 2 of Theorem IX: “Let us assume meanwhile that E is everything which A and G have in common – if they have something in common, so that if they have nothing in common, E = Nothing”), that is,

COMMON 3 Com(A,B) ↔ A∩B ≠ ∅.

Furthermore, “What has been subtracted and the remainder are uncommunicating” (LLP, 128, Theorem X), that is,

COMMON 4 ¬Com(A-B,B).

Leibniz further discovered the following formula which allows one to “calculate” the intersection or “commune” of A and B by a series of additions and subtractions: A∩B = B-((A+B)-A). In a small fragment (Cout, 250) he explained:

Suppose you have A and B and you want to know if there exists some M which is in both of them. Solution: combine those two into one, A+B, which shall be called L […] and from L one of the constituents, A, shall be subtracted […] let the rest be N; then, if N coincides with the other constituent, B, they have nothing in common. But if they do not coincide, they have something in common which can be found by subtracting the rest N […] from B […] and there remains M, the commune of A and B, which was looked for.
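The laws of the Plus-Minus-Calculus, read as laws about finite sets, can be tested directly. The following Python sketch is a minimal illustration in which ‘+’ is read as union and ‘-’ as set difference (that is, the second, more liberal reconstruction in which A-B is admitted even if B is not contained in A); the four-element universe and the function names plus, minus, com and commune are assumptions made for this example. The function commune follows the recipe of the fragment just quoted step by step.

```python
from itertools import combinations, product

U = frozenset({1, 2, 3, 4})  # hypothetical universe of "real" things
SETS = [frozenset(c) for r in range(len(U) + 1)
        for c in combinations(sorted(U), r)]
EMPTY = frozenset()

def plus(a, b):
    """Real addition A+B, read as union."""
    return a | b

def minus(a, b):
    """Real subtraction A-B, read as set difference."""
    return a - b

def com(a, b):
    """COMMON 1: some non-empty Y is in both A and B."""
    return any(y != EMPTY and y <= a and y <= b for y in SETS)

def commune(a, b):
    """Leibniz's recipe: L = A+B, N = L-A, M = B-N."""
    rest = minus(plus(a, b), a)
    return minus(b, rest)

pairs = list(product(SETS, repeat=2))
triples = list(product(SETS, repeat=3))

checks = {
    "MINUS 2*": all(minus(plus(a, b), b) == a for a, b in pairs),
    "MINUS 2": all(minus(plus(a, b), b) == a for a, b in pairs if not com(a, b)),
    "MINUS 4": all(b == c for a, b, c in triples
                   if not com(a, b) and not com(a, c) and plus(a, b) == plus(a, c)),
    "COMMON 2": all(com(a, b) == (minus(plus(a, b), b) != a) for a, b in pairs),
    "COMMON 3": all(com(a, b) == ((a & b) != EMPTY) for a, b in pairs),
    "COMMON 4": all(not com(minus(a, b), b) for a, b in pairs),
    "commune": all(commune(a, b) == (a & b) for a, b in pairs),
}

for name, ok in checks.items():
    print(name, "holds" if ok else "fails")
```

For instance, with A = {1, 2, 3} and B = {3, 4}, the recipe gives L = {1, 2, 3, 4} and N = {4}; since N does not coincide with B, the two sets communicate, and B-N = {3} is their commune.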

4. Leibniz’s Calculus of Strict Implication

It is a characteristic feature of Leibniz’s logic that when he states and proves the laws of concept logic, he takes the requisite rules and laws of propositional logic for granted. Once the former have been established, however, the latter can be obtained from the former by observing a strict analogy between concepts and propositions which allows one to re-interpret the conceptual connectives as propositional connectives. Note, incidentally, that in the 19th century George Boole in roughly the same way first presupposed propositional logic to develop his algebra of sets, and only afterwards derived the propositional calculus out of the set-theoretical calculus. While Boole thus arrived at the classical, two-valued propositional calculus, Leibniz’s approach instead yields a modal logic of strict implication.

Leibniz outlined a simple, ingenious method to transform the algebra of concepts into an algebra of propositions. Already in the “Notationes Generales” written between 1683 and 1685 (AE VI, 4, # 131), he pointed out the parallel between the containment relation among concepts and the implication relation among propositions. Just as the simple proposition ‘A is B’ is true “when the predicate [B] is contained in the subject” A, so a conditional proposition ‘If A is B, then C is D’ is true, “when the consequent is contained in the antecedent” (AE VI, 4, 551). In later works Leibniz compressed this idea into formulations such as “a proposition is true whose predicate is contained in the subject or more generally whose consequent is contained in the antecedent” (Cout, 401). The most detailed explanation of this idea was given in §§ 75, 137 and 189 of the GI:

If, as I hope, I can conceive all propositions as terms, and hypotheticals as categoricals and if I can treat all propositions universally, this promises a wonderful ease in my symbolism and analysis of concepts, and will be a discovery of the greatest importance […]

We have, then, discovered many secrets of great importance for the analysis of all our thoughts and for the discovery and proof of truths. We have discovered […] how absolute and hypothetical truths have one and the same laws and are contained in the same general theorems […]

Our principles, therefore, will be these […] Sixth, whatever is said of a term which contains a term can also be said of a proposition from which another proposition follows (LLP, 66, 78, and 85).

To conceive all propositions in analogy to concepts means in particular that the conditional ‘If a then b’ will be logically treated like the containment relation between concepts, ‘A contains B’. Furthermore, as Leibniz explained elsewhere, negations and conjunctions of propositions are to be conceived just as negations and conjunctions of concepts. Thus one obtains the following mapping of the primitive formulas of the algebra of concepts into formulas of the algebra of propositions:

A∈B α → β

A=B α ↔ β

~A ¬α

AB α∧β

P(A) ◊α

As Leibniz himself explained, the fundamental law POSS 2 does not only hold for the containment-relation between concepts but also for the entailment relation between propositions:

‘A contains B’ is a true proposition if ‘A non-B’ entails a contradiction. This applies both to categorical and to hypothetical propositions (Cout, 407).

Hence A∈B ↔ ¬P(A~B) may be “translated” into (α→β) ↔ ¬◊(α∧¬β). This formula unmistakably shows that Leibniz’s conditional is not a material but rather a strict implication. As Rescher already noted in (1954: 10), Leibniz’s account provides a definition of “entailment in terms of negation, conjunction, and the notion of possibility”, which coincides with the modern definition of strict implication put forward, for example, in Lewis & Langford (1932: 124): “The relation of strict implication can be defined in terms of negation, possibility, and product […] Thus ‘p implies q’ […] is to mean ‘It is false that it is possible that p should be true and q false’”. This definition is almost identical with Leibniz’s explanation in “Analysis Particularum”: “Thus if I say ‘If L is true it follows that M is true’, this means that one cannot suppose at the same time that L is true and that M is false” (AE VI, 4, 656).

Given the above “translation”, the basic axioms and theorems of the algebra of concepts can be transformed into the following laws of the algebra of propositions:

IMPL 1 α → α

IMPL 2 (α → β) ∧ (β→γ) → (α→γ)

IMPL 3 (α → β) ↔ (α ↔ α∧β)

CONJ 1 (α → β∧γ) ↔ ((α→β) ∧ (α→γ))

CONJ 2 α∧β → α

CONJ 3 α∧β → β

CONJ 4 α∧α ↔ α

CONJ 5 α∧β ↔ β∧α

NEG 1 ¬¬α ↔ α

NEG 2 ¬(α ↔ ¬α)

NEG 3 (α → β) ↔ (¬β→ ¬α)

NEG 4 ¬α → ¬(α∧β)

NEG 5 ◊α → ((α → β) → ¬(α → ¬β))

NEG 6 (α ∧¬α) → β

POSS 1 (α → β) ∧ ◊α → ◊β

POSS 2 (α → β) ↔ ¬◊(α ∧ ¬β)

POSS 3 ¬◊(α ∧ ¬α)
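The difference between a strict and a material conditional can be made tangible in a small model. The Python sketch below is an illustration under explicit assumptions: propositions are modelled as sets of hypothetical "worlds", every world is taken into account when evaluating ◊ (no accessibility relation is assumed), and the names W, impl, material, dia and valid are introduced only for this example. A formula counts as valid if it is true in every world for every choice of propositions.

```python
from itertools import combinations, product

W = frozenset({0, 1, 2})   # hypothetical set of worlds
NONE = frozenset()
PROPS = [frozenset(c) for r in range(len(W) + 1)
         for c in combinations(sorted(W), r)]

def neg(a):
    """¬α: true in exactly the worlds where α is false."""
    return W - a

def dia(a):
    """◊α: true everywhere if α is true somewhere, otherwise true nowhere."""
    return W if a else NONE

def impl(a, b):
    """Strict implication α→β: true (everywhere) iff no world has α without β."""
    return W if a <= b else NONE

def material(a, b):
    """Material conditional, evaluated world by world."""
    return (W - a) | b

def iff(a, b):
    return impl(a, b) & impl(b, a)

def valid(formula, arity):
    """True if the formula is true in every world under every assignment."""
    return all(formula(*args) == W for args in product(PROPS, repeat=arity))

checks = {
    "IMPL 2": valid(lambda a, b, c: impl(impl(a, b) & impl(b, c), impl(a, c)), 3),
    "IMPL 3": valid(lambda a, b: iff(impl(a, b), iff(a, a & b)), 2),
    "NEG 5": valid(lambda a, b: impl(dia(a),
                                     impl(impl(a, b), neg(impl(a, neg(b))))), 2),
    "POSS 2": valid(lambda a, b: iff(impl(a, b), neg(dia(a & neg(b)))), 2),
    # 'A false proposition implies anything' holds materially but not strictly.
    "¬α → (α → β), material": valid(lambda a, b: material(neg(a), material(a, b)), 2),
    "¬α → (α → β), strict": valid(lambda a, b: impl(neg(a), impl(a, b)), 2),
}

for name, ok in checks.items():
    print(name, "valid" if ok else "not valid")
```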

5. Works on Modal Logic

When people credit Leibniz with having anticipated “Possible-worlds-semantics”, they mostly refer to his philosophical writings, in particular to the “Nouveaux Essais sur l’entendement humain” (NE) and to the metaphysical speculations of the “Essais de Théodicée” (Theo) of 1710. Leibniz argues there that while there are infinitely many ways in which God might have created the world, the real world that God finally decided to create is the best of all possible worlds. As a matter of fact, however, Leibniz has much more to offer than this over-optimistic idea (which was rightly criticized by Voltaire and, for example, in part 2 of chapter 8 of Hume’s “An Enquiry concerning Human Understanding”). In what follows we briefly consider some of Leibniz’s early logical works where

(1) the idea that a necessary proposition is true in each possible world (while a possible proposition is true in at least one possible world) is formally elaborated, and where

(2) the close relation between alethic and deontic modalities is unveiled.

a. Possible-Worlds-Semantics for Alethic Modalities

The fundamental logical relations between necessity, ☐, possibility, ◊, and impossibility can be expressed, for example, by:

NEC 1 ☐(α) ↔ ¬◊(¬α)

NEC 2 ¬◊(α) ↔ ☐(¬α).

These laws were familiar already to logicians long before Leibniz. However, Leibniz “proved” these relations by means of an admirably clear analysis of modal operators in terms of “possible cases”, that is, possible worlds:

Possible is whatever can happen or what is true in some cases

Impossible is whatever cannot happen or what is true in no […] case

Necessary is whatever cannot not happen or what is true in every […] case

Contingent is whatever can not happen or what is [not] true in some case. (AE VI, 1, 466).

As this quotation shows, Leibniz uses the notion of contingency not in the modern sense of ‘neither necessary nor impossible’ but as the simple negation of ‘necessary’. The quoted analysis of the truth-conditions for modal propositions entails the validity not only of NEC 1, 2, but also of:

NEC 3 ☐α → ◊(α)

NEC 4 ¬◊(α) → ¬☐(α).

Leibniz “proves” these laws by reducing them to corresponding laws for quantifiers such as: If α is true in each case, then α is true in at least one case. In the “Modalia et Elementa Juris Naturalis” of around 1679, Leibniz mentions NEC 3 and NEC 4 in passing: “Since everything which is necessary is possible, so everything that is impossible is contingent, that is, can fail to happen” (AE VI, 4, 2759). A very elliptic “proof” of these laws was already sketched in the “Elementa juris naturalis” of 1669/70 (AE VI, 1, 469).
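The reduction of the modal laws to laws for quantifiers can be mirrored almost literally in code, with ‘true in every case’ rendered by Python’s all and ‘true in some case’ by any. The sketch below is only an illustration; the list of cases and the function names are hypothetical. It also makes explicit a presupposition of NEC 3 and NEC 4: the step from ‘each case’ to ‘at least one case’ requires that there is at least one case.

```python
from itertools import product

def necessary(p, cases):
    """True in every case."""
    return all(p(c) for c in cases)

def possible(p, cases):
    """True in some case."""
    return any(p(c) for c in cases)

def impossible(p, cases):
    """True in no case."""
    return not possible(p, cases)

def contingent(p, cases):
    """Leibniz's sense: not true in some case, i.e. simply not necessary."""
    return not necessary(p, cases)

def laws_hold(cases):
    """Check NEC 1 to NEC 4 for every proposition definable over the cases."""
    result = {"NEC 1": True, "NEC 2": True, "NEC 3": True, "NEC 4": True}
    for values in product([False, True], repeat=len(cases)):
        table = dict(zip(cases, values))   # truth value of α in each case
        p = lambda c: table[c]
        not_p = lambda c: not table[c]
        nec, pos = necessary(p, cases), possible(p, cases)
        result["NEC 1"] &= nec == (not possible(not_p, cases))
        result["NEC 2"] &= impossible(p, cases) == necessary(not_p, cases)
        result["NEC 3"] &= (not nec) or pos
        result["NEC 4"] &= pos or contingent(p, cases)
    return result

print(laws_hold([0, 1, 2]))   # three hypothetical cases
print(laws_hold([]))          # no cases at all: NEC 3 and NEC 4 break down
```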

It cannot be overlooked, however, that Leibniz’s semi-formal truth conditions, even when combined with his later views on possible worlds, fail to come up to the standards of modern possible worlds semantics, since nothing in Leibniz’s considerations corresponds to an accessibility relation among worlds.

b. Basic Principles of Deontic Logic

As has already been pointed out by Schepers (1972) and Kalinowski (1974), Leibniz saw very clearly that the logical relations between the deontic modalities obligatory, permitted and forbidden exactly mirror the corresponding relations between necessary, possible and impossible, and that therefore all laws and rules of alethic modal logic may be applied to deontic logic as well.

Just like ‘necessary’, ‘contingent’, ‘possible’ and ‘impossible’ are related to each other, so also are ‘obligatory’, ‘not obligatory’, ‘permitted’, and ‘forbidden’ (AE VI, 4, 2762).

This structural analogy goes hand in hand with the important discovery that the deontic notions can be defined by means of the alethic notions plus the additional “logical” constant of a morally perfect man (“vir bonus”). Such a virtuous man is characterized by the requirements that he strictly obeys all laws, always acts in such a way that he does no harm to anybody, and is benevolent to all other people. Given this understanding of a “vir bonus”, Leibniz explains:

Obligatory is what is necessary for the virtuous man as such.

Not obligatory is what is contingent for the virtuous man as such.

Permitted is what is possible for the virtuous man as such.

Forbidden is what is impossible for the virtuous man as such (Grua, 605).

If we express the restriction of the modal operators ☐ and ◊ to the virtuous man by means of a subscript ‘v’, these definitions can be formalized as follows (where the letter ‘E’, recalling the German word ‘erlaubt’, is used instead of ‘P’ for ‘permitted’ in order to avoid confusion with the operator of possibility):

DEON 1 O(α) ↔ ☐_v(α)

DEON 2 E(α) ↔ ◊_v(α)

DEON 3 F(α) ↔ ¬◊_v(α).

Now, as Leibniz mentioned in passing, all that is unconditionally necessary will also be necessary for the virtuous man:

NEC 5 ☐(α) → ☐_v(α).

Hence (as was shown in more detail in Lenzen (2005)), Leibniz’s derivation of the fundamental laws for the deontic operators from corresponding laws of the alethic modal operators proceeds in much the same way as the modern reduction of deontic logic to alethic modal logic “rediscovered” almost 300 years after Leibniz by Anderson (1958).
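
As a minimal sketch of how such a derivation can run (in modern notation, not Leibniz’s own wording), suppose that the operators restricted to the virtuous man obey two familiar alethic laws, labelled (A1) and (A2) below. Both labels and both restricted laws are assumptions of this sketch rather than formulas quoted from the sources. Together with DEON 1–3, NEC 5 and classical propositional logic, they yield the basic deontic laws:

```latex
% Sketch only: (A1) and (A2) are assumed laws for the restricted operators.
% Requires amsmath and amssymb.
\begin{align*}
&\text{(A1)}\quad \Box_v(\alpha) \rightarrow \Diamond_v(\alpha)
  && \text{assumed: restricted form of ``necessary implies possible''}\\
&\text{(A2)}\quad \Diamond_v(\alpha) \leftrightarrow \neg\Box_v(\neg\alpha)
  && \text{assumed: restricted duality}\\[4pt]
&O(\alpha) \rightarrow E(\alpha)
  && \text{DEON 1, (A1), DEON 2}\\
&F(\alpha) \leftrightarrow \neg E(\alpha)
  && \text{DEON 3, DEON 2}\\
&E(\alpha) \leftrightarrow \neg O(\neg\alpha)
  && \text{DEON 2, (A2), DEON 1 applied to } \neg\alpha\\
&F(\alpha) \leftrightarrow O(\neg\alpha)
  && \text{DEON 3, (A2), DEON 1 applied to } \neg\alpha\\
&\Box(\alpha) \rightarrow O(\alpha)
  && \text{NEC 5, DEON 1}
\end{align*}
```

The first derived line is the deontic counterpart of the principle that whatever is necessary is possible: whatever is obligatory is permitted.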

6. References and Further Reading

a. Abbreviations for Leibniz’s works

AE German Academy of Science (ed.), G. W. Leibniz, Sämtliche Schriften und Briefe, Series VI, „Philosophische Schriften“, Darmstadt 1930, Berlin 1962 ff.
Cout Louis Couturat (ed.), Opuscules et fragments inédits de Leibniz, Paris (Presses universitaires de France) 1903, reprint Hildesheim (Olms) 1961.
GI Generales Inquisitiones de Analysi Notionum et Veritatum; first edited in Cout, 356-399; text-critical edition in AE VI, 4, 739-788; English translation in LLP, 47-87.
GP C. I. Gerhardt (ed.), Die philosophischen Schriften von G. W. Leibniz, seven volumes Berlin/Halle 1875-90, reprint Hildesheim (Olms) 1965.
Grua Gaston Grua (ed.), G. W. Leibniz – Textes Inédits, two Volumes, Paris (Presses Universitaires de France) 1948.
LH Eduard Bodemann (ed.), Die Leibniz-Handschriften der Königlichen Öffentlichen Bibliothek zu Hannover, Hannover 1895, reprint Hildesheim (Olms) 1966.
LLP G. H. R. Parkinson (ed.), Leibniz Logical Papers – A Selection, Oxford (Clarendon Press), 1966.
NE Nouveaux Essais sur l’entendement humain – Par l’Auteur du Système de l’Harmonie Preestablie, in GP 5, 41-509.
Theo Essais de Theodicée sur la Bonté de Dieu, la Liberté de l’Homme et l’Origine du Mal, in GP 6, 21-436.

b. Secondary Literature

Anderson, Alan Ross (1958): “A Reduction of Deontic Logic to Alethic Modal Logic”, in Mind LXVII, 100-103.
Arnauld, Antoine & Nicole, Pierre (1683) : La Logique ou L’Art de Penser, 5th edition, reprint 1965 Paris (Presses universitaires de France).
Burkhardt, Hans (1980): Logik und Semiotik in der Philosophie von Leibniz, München (Philosophia Verlag).
Couturat, Louis (1901): La Logique de Leibniz d’après des documents inédits, Paris (Félix Alcan).
Dürr, Karl (1930): Neue Beleuchtung einer Theorie von Leibniz – Grundzüge des Logikkalküls, Darmstadt.
Euler, Leonhard (1768): Lettres à une princesse d’Allemagne sur quelques sujets de physique et de philosophie, St Petersburg, 1768–1772.
Hamilton, William (1861): Lectures on Metaphysics and Logic, ed. by H.L. Mansel & J. Veitch, Edinburgh, London (William Blackwood); reprint Stuttgart-Bad Cannstatt 1969.
Ishiguro, Hidé (1972): Leibniz’s Philosophy of Logic and Language, London (Duckworth).
Kalinowski, George (1974): “Un logicien déontique avant la lettre: Gottfried Wilhelm Leibniz”, in Archiv für Rechts- und Sozialphilosophie 60, 79-98.
Kauppi, Raili (1960): Über die Leibnizsche Logik mit besonderer Berücksichtigung des Problems der Intension und der Extension, Helsinki (Acta Philosophica Fennica).
Kneale, William and Martha (1962): The Development of Logic, Oxford (Clarendon).
Lenzen, Wolfgang (1984): “Leibniz und die Boolesche Algebra”, in Studia Leibnitiana 16, 187-203.
Lenzen, Wolfgang (1986): “‘Non est’ non est ‘est non’ – Zu Leibnizens Theorie der Negation”, in Studia Leibnitiana 18, 1-37.
Lenzen, Wolfgang (1989): “Arithmetical vs. ‘Real’ Addition – A Case Study of the Relation between Logic, Mathematics, and Metaphysics in Leibniz”, in N. Rescher (ed.), Leibnizian Inquiries – A Group of Essays, Lanham, 149-157.
Lenzen, Wolfgang (1990): Das System der Leibnizschen Logik, Berlin (de Gruyter).
Lenzen, Wolfgang (2004a): Calculus Universalis – Studien zur Logik von G. W. Leibniz, Paderborn (mentis).
Lenzen, Wolfgang (2004b): “Leibniz’s Logic”, in D. Gabbay & J. Woods (eds.) The Rise of Modern Logic – From Leibniz to Frege (Handbook of the History of Logic, vol. 3), Amsterdam (Elsevier), 1-83.
Lenzen, Wolfgang (2004c): “Logical Criteria for Individual(concept)s”, in M. Carrara, A. M. Nunziante & G. Tomasi (eds.), Individuals, Minds, and Bodies: Themes from Leibniz, Stuttgart (Steiner), 87-107.
Lenzen, Wolfgang (2005): “Leibniz on Alethic and Deontic Modal Logic”, in D. Berlioz & F. Nef (eds.), Leibniz et les Puissances du Langage, Paris (Vrin), 341-362.
Lewis, Clarence I. & Langford, Cooper H. (1932): Symbolic Logic, New York, ²1959 (Dover Publications).
Liske, M.-Th. (1994): “Ist eine reine Inhaltslogik möglich? Zu Leibniz’ Begriffstheorie”, in Studia Leibnitiana XXVI, 31-55.
Lukasiewicz, Jan (1951): Aristotle’s Syllogistic – From the Standpoint of Modern Formal Logic, Oxford (Clarendon Press).
Mugnai, Massimo (1992): Leibniz’s Theory of Relations, Stuttgart (Steiner).
Poser, Hans (1969): Zur Theorie der Modalbegriffe bei G. W. Leibniz, Wiesbaden (Steiner).
Rescher, Nicholas (1954): “Leibniz’s interpretation of his logical calculus”, in Journal of Symbolic Logic 19, 1-13.
Rescher, Nicholas (1979): Leibniz – An Introduction to his Philosophy, London (Billing & Sons).
Sanchez-Mazas, Miguel (1979): “Simplification de l’arithmétisation leibnitienne de la syllogistique par l’expression arithmétique de la notion intensionelle du ‘non ens’”, in Studia Leibnitiana Sonderheft 8, 46-58.
Schepers, Heinrich (1972): “Leibniz’ Disputationen ‘De Conditionibus’: Ansätze zu einer juristischen Aussagenlogik”, in Studia Leibnitiana Supplementa XV, 1-17.
Schupp, Franz (ed.) (1982): G. W. Leibniz, Allgemeine Untersuchungen über die Analyse der Begriffe und Wahrheiten, Hamburg (Meiner).
Schupp, Franz (ed.) (2000): G. W. Leibniz, Die Grundlagen des logischen Kalküls, Hamburg (Meiner).
Sotirov, Vladimir (1999): “Arithmetizations of Syllogistic à la Leibniz”, in Journal of Applied Non-Classical Logics 9, 387-405.
Swoyer, Chris (1995): “Leibniz on Intension and Extension”, in Noûs 29, 96-114.
Venn, John (1881): Symbolic Logic, London (MacMillan).

Author Information

Wolfgang Lenzen
Email: lenzen@uos.de
University of Osnabrück
Germany

Hugo Grotius (1583—1645)

Hugo Grotius was a Dutch humanist and jurist whose philosophy of natural law had a major impact on the development of seventeenth century political thought and on the moral theories of the Enlightenment. Valorized by contemporary international theorists as the father of international law, he produced work on sovereignty, international rights of commerce and the norms of just war that continues to inform theories of the international legal order. Particularly notable in this respect are his major work, De Jure Belli ac Pacis (The Rights of War and Peace), and Mare Liberum, a defense of the freedom of the seas that is considered an antecedent, inspiration and backbone of the modern law of the sea.

Grotius was heavily influenced by classical philosophy, most prominently Aristotle and the Stoics, as well as by the contemporary humanist tradition and the late-medieval Scholastics. Caught up in the religious strife of the Reformation, Grotius promoted an irenic vision that would unite and reconcile the Christian Church on the principles of civil religion and toleration. He was well known in his time as much for his poetry and philosophy of religion as for his work on law and politics but is best remembered for his influence on theories of the social contract, natural rights and the laws of war.

1. Life and Works
2. Irenicism and Tolerance
3. Sovereignty and Imperialism
a. Divisible Sovereignty
b. Resistance, War and Empire
4. Natural Right and the Law of Nations
5. Scholarly Interest in Grotius
6. References and Further Reading
a. Primary Sources
b. Secondary Sources

1. Life and Works

Huig de Groot, best known by the Latinized name Hugo Grotius, began his life in 1583 in the commercial town of Delft, while the Dutch Republic persevered through a second decade of war for independence from Hapsburg rule and was already positioning itself for ascendancy as an overseas trading power. Born into a family with standing among the city elite and connections to the recently founded University of Leiden, young Hugo would find many opportunities to develop his considerable talents for scholarly pursuits even as a child. His family tutored him in Greek and Latin at an early age, introduced him to classical letters, and brought him up in the disciplines of Reformed faith. So outstanding were his gifts for intellectual work that he was welcomed to enroll at Leiden University at the mere age of eleven. At the university, the boy de Groot became a favored student of some of the most celebrated scholars of the time, discovering his talents in a whole range of subjects in the liberal arts and new sciences. His reputation as a promising young man of letters would open a number of doors for him in the political life of the time, where humanist expertise was a valued asset. The most auspicious of these opportunities came as he was preparing for life beyond the university. In 1598, no less a figure than Jan van Oldenbarnevelt, the Grand Pensionary and most influential personality in Dutch politics, invited Grotius to accompany his delegation to the French court. The embassy, which ultimately failed in its aim to renew the king’s military support against Spain, nonetheless brought Grotius into the fold of high politics and even earned him a reputation with the French court when Henry IV lauded the learned youth as “the miracle of Holland.” The connections he made in France enabled Grotius to extend his stay and earn a Doctor of Laws degree from the University of Orléans before returning to Holland the following year.

Entering into practice as a lawyer in The Hague, Grotius took advantage of chances to hone his rhetorical skills and found time to devote to his diverse scholarly interests. His earliest writings to go into print included several imitations of classical verse and translations of significant works in compass navigation and astronomy, the latter being of keen interest to his friends invested in the burgeoning overseas trade. In 1601, he published a tragedy, Adamus Exul (Adam in Exile), that earned him instant acclaim as a poet; it was a work that John Milton would later study in preparing his Paradise Lost. While Grotius prized these pursuits more highly than the mundane work of a lawyer, he always strove to please his patrons and clients. Indeed, his most lasting contributions to political thought took shape in the course of his professional duties during this period.

In 1604, Grotius was drawn into the sensational controversy over privateering in the Southeast Asian trade. The United Dutch East India Company had been rising quickly as a major player in European overseas commerce, and Grotius shared the view of many of his associates involved in the trade that the Company not only buoyed up the young republic with wealth but also weakened its adversaries by cutting into Iberian dominance of the East Indian routes. Still, acts of piracy by a private concern did not sit well with many citizens and allies. When asked by a friend with Company connections to write a brief justifying a recent and very lucrative seizure of Spanish cargo, Grotius went on to produce not only an ardent defense of the capture but an investigation into the deep principles of law that connected those separated by nation and culture. The resulting manuscript, provisionally titled De Indis (On the Indies), was not published in full until long after Grotius’ death (appearing in 1868 as Commentary on the Laws of Prize and Booty). It was the young jurist’s first systematic work on the problems of international affairs and was in many ways his most philosophically developed. Many of the arguments worked out in the manuscript (that there is a basic law of nature determined by the need to reconcile self-preservation with social life, that the authority to govern and even to punish derives from the rights of natural persons prior to the founding of civil societies, and that claims to jurisdiction over the open seas are invalid) would give direction to his later works.

In fact, the last of these arguments would appear in print in 1609 as the anonymous pamphlet, Mare Liberum (The Free Seas). The pamphlet, which Grotius pulled directly from the text of De Indis, once again served the interests of those in the Dutch political and commercial establishment who were insisting upon the right of access to overseas routes in the ongoing negotiations for a truce with the Spanish. The work argued not only that the Spanish claims to a trading monopoly in Southeast Asia and elsewhere (claims that these were rights conferred by papal authority or acquired by just conquest) failed to square with the facts, but that there was, in principle, no basis for any monopoly on access to the seas. The freedom of the seas was entailed by the very nature of private property. To privately own a thing requires that one can occupy it, taking it out of the common store, and that one can make full use of it. The sea cannot be contained and is too plentiful for its usefulness to be exhausted by a few; hence, no one can take exclusive ownership of the sea. The seas remain open to all. This question was of great importance in European relations during this period of intense competition between aspiring overseas empires, and Grotius’ work would frame the debate to follow. During this time in his early legal career, he penned a number of other manuscripts touching on matters of international relations that, while mostly unpublished, shaped his later work on the subject. The Parallelon Rerumpublicarum (composed 1601-2) explored the concept of ‘good faith’ in dealings with other nations through some flattering comparisons among the customs of the Greek, Roman and Dutch peoples. In his Commentary on Eleven Theses (circa 1602-08), Grotius worked out an understanding of the ruling power of a state (its sovereignty) and its relation to the principles of just war.

Having proved the usefulness of his talents to the ruling elite, Grotius saw his star continue to rise. He gained recognition from Prince Maurits of Orange, the executive and military leader of the United Provinces, when in 1607, the prince appointed him as attorney general of the provinces of Holland, Zeeland and West Friesland. It was during this time that he became engaged to be married to a young woman from a distinguished family in Zeeland, Maria van Reigersberch. Her partnership and personal courage would carry the family through a tumultuous life that the young couple could not have expected at the time of their wedding in 1608. Soon thereafter, Maria gave birth to the first of seven children. As his focus shifted from legal practice to public service, Grotius began to put a number of his writings to press. His second celebrated tragedy, The Passion of Christ, came out in 1608, followed by the anonymous Mare Liberum in 1609 and a political history of the old Dutch republic, De Antiquitate Reipublicae Batavicae, in 1610. The historical account provided ideological leverage for the position that Holland had persisted in its republican form of government despite the princely claims of the Hapsburgs. The governing States of Holland commissioned Grotius to write a detailed history of the conflict with Spain, which he submitted in 1612. The States declined, likely due to the delicate truce, to publish the work, leaving the Annales et Historiae de Rebus Belgicis to rest until his sons brought it out posthumously in 1657. Opportunity for higher office came again when, in 1612, the town council of Rotterdam offered Grotius the mayoral position of Pensionary. The title brought with it a seat in the States of Holland where he would collaborate more closely with his mentor, Oldenbarnevelt, and key players in provincial and national politics.

The political controversy that would end up defining Grotius’ tenure in office began with small rumblings when, in 1608, the professor of theology at the University of Leiden, Jacob Arminius, put forth a doctrine that challenged key features of the reigning Calvinist orthodoxy concerning predestination (see below: Irenicism and Tolerance). Calvinist church officials and divines came out strongly against the preaching of such a view. Though Arminius died the following year, the conflict escalated in a way that pitted the church establishment against the civil authorities over the question of who could rule on such doctrinal disputes. Grotius shared with many in the government of Holland some sympathies with the Arminian view but desired above all to prevent such matters from disturbing the peace. He had been composing, during this time, a manuscript on the idea that all faiths shared a set of core doctrines, a viewpoint capable of promoting a certain equanimity towards squabbles over the finer points of theology. This was in any case the political attitude Grotius favored, and while he never published the Meletius manuscript, he developed several writings on the role of the state in managing conflicts over religion. The pamphlet, Ordinum Hollandiae et Westfrisiae pietas (1613), defended the ‘piety’ of the governments of Holland and West Friesland in imposing a policy of toleration that allowed Arminians to preach their dissenting doctrine. Grotius himself had drafted the policy, which failed in its aim of mollifying the factions and, in fact, heightened the conflict between the civil and ecclesiastical authorities. Convinced that the practice of religion was a concern proper to civil magistrates, Grotius set about justifying his views in a longer treatise.
De Imperio Summarum Potestatum circa Sacra argued that, to avoid a conflict of rights, there must be only one final authority within a state on how religion is to be practiced, that because of its mandate to keep civil peace and form responsible citizens this authority ought to come under the civil power, and that civil magistrates would do well to limit their judgments to the core doctrines Grotius had worked out in Meletius. He developed, though never published, the manuscript of De Imperio as the political conflict continued to escalate during 1614-17. His sympathies with the Arminian theology also grew during this period, and in 1617 he took it upon himself to brush back the charges of heresy with the publication of a theological work, Defensio Fidei Catholicae de Satisfactione Christi adversus Faustum Socinum.

As Grotius was being drawn further into the controversy, it came to consume national politics. The orthodox Calvinists, who were a majority at the national level and now had the backing of Prince Maurits, were demanding a national synod to settle the matter. This set up a standoff between Maurits, the national executive and commander of the armed forces, and Oldenbarnevelt, the most influential figure in the States assembly. Oldenbarnevelt led the elites of Holland, including Grotius, in blocking the synod and managing the dispute at the provincial level. That policy culminated in a decision, when riots broke out in 1617, to authorize local militias to suppress the disorder. Maurits denounced the act as an offense against his military authority, and he seized the opportunity to turn the tide against his political adversaries. At the end of an extended political and military campaign to push the Arminian supporters out of the establishment, he ordered the arrest of Oldenbarnevelt and his key supporters in August 1618. Grotius, with his mentor, was locked up and set for trial. A national synod, the famous Synod of Dort, was scheduled. Though incensed at the military coup d’état against the sovereign institutions of Holland, Grotius calmly petitioned Maurits and the national States-General to no effect. The trials commenced the following year, and Grotius saw his mentor condemned to death for high treason. On May 18, 1619, his own sentence came down: confiscation of property and life imprisonment.

Although he would strive for the rest of his life to vindicate himself and lift the disgrace of the charges from himself and his family, Grotius entered at the age of thirty-six into his term of imprisonment in the castle Loevestein. The only solace of his confinement was that his family was allowed to reside with him and that on her regular leaves his wife Maria was able to bring back books and papers. The scholar was able to turn his isolation to some greater purpose. In Loevestein, Grotius renewed a number of neglected projects. He wrote, entirely in didactic verse, a more systematic treatment of his view that there are essential elements common to all religions and that the doctrines of Christianity were recognizable through reason as the most consistent and highest expression of the common faith. The work, initially composed in Dutch, would serve as the basis for his renowned De Veritate Religionis Christianae (The Truth of the Christian Religion). Through his work in law and legal history, he had conceived the plan of writing a rigorous guidebook on the jurisprudence of Holland in the Dutch vernacular. The later publication, in 1631, of Inleidinge tot de Hollandsche Rechts-geleerdheid (Introduction to the Jurisprudence of Holland) would eventually give his book a status in Dutch law analogous to Blackstone’s Commentaries in the English system. Grotius was convinced that he could achieve the same kind of ordered treatment of the concepts, principles and precedents governing relations at the international level. Closed within the walls of his cell, he reached out for a global view of human affairs and prepared parts of what would become the massive treatise, De Jure Belli ac Pacis (The Rights of War and Peace). At the same time, Grotius was looking beyond the walls of Loevestein with a mind for a more immediate scheme: escape. He knew that he had support in the court of Louis XIII in France, and his hopes for reestablishing himself pointed towards Paris.
Maria and the family’s young maid-servant, Elsje van Houwening, hatched the plan for escape. On March 22, 1621, Maria made arrangements for a chest of books to be shipped to the nearby town of Gorcum, then helped her husband into the cramped chest and watched Elsje accompany the guards as they unwittingly delivered their prisoner into the hands of friends. A month later, Grotius was in Paris, separated from his family, exiled from his beloved country, yet free.

The long period Grotius spent in exile saw the publication of his most remembered works. Having secured the support of Louis XIII and being reunited with his family, he prepared several manuscripts that he hoped would restore him to prominence. The Apologeticus, appearing in 1622, was straight to the purpose: it contained a full defense of his conduct as a public official of Holland. Despite his earnest pleas of loyalty and the best efforts of his friends, the States-General spurned his arguments and authorized a bounty on him. He turned his attention to the scholarly projects begun in Loevestein. The treatise on the universal law of nature and nations, divided into three hefty books, grew out of the reflections on the subject he had begun twenty years prior. Its first book developed an account of natural justice, so central to his earlier arguments about the Southeast Asian trade, and laid out a broad framework for judging “controversies of any and every kind, as are likely to arise” (JBP I.I.i), whether among politically sovereign entities, private parties, or rival camps within a state. The lengthy second book provided a grounding for the rights in one’s person, property, and sovereignty (subjects he was revisiting from Mare Liberum and his unpublished commentaries) and a detailed consideration of the ways such rights could be acquired, transferred, lost, and protected by recourse to war. The third book, dramatizing the gap between the prevailing customs of warfare and the demands placed on us by a more humane conscience, considered what responsibilities parties have to all those they affect in wartime and in upholding good faith in efforts to build the peace. Many of the arguments of the work were forged in Grotius’ career as an advocate and public official, though he insists in the Prolegomena to the treatise that his perspective in the work is that of a mathematician, abstracting away from particular facts and controversies of the day.
When the first edition of De Jure Belli ac Pacis made its appearance in 1625, its readers would have no shortage of conflicts to which to apply its ideas about war and peace, from the campaigns of conquest and appropriation overseas to the long-raging religious conflicts on the continent that were escalating into what would be the Thirty Years War.

Grotius continued, while in France, to write and visit scholars. His Latin edition of The Truth of the Christian Religion came out in 1627. It would become his most widely read and translated work. Despite the unreliability of his pension from King Louis, he turned down some tempting offers to serve as a diplomat for other nations and instead renewed his efforts to rehabilitate his standing in the Netherlands. Upon the death of Prince Maurits, Grotius returned to Holland in 1631 in hopes of finding favor with the new Prince of Orange, Frederick Henry, but an arrest warrant from the States-General forced him to flee and take up refuge in Hamburg. Grotius and his wife remained for more than two years in the city without any great prospects. He set himself to composing a third major tragedy, Sophompaneas (Joseph), which would appear in 1635. By that time, his work on the laws of war had brought opportunity to his doorstep. In 1634, he was called to meet with the Swedish High Chancellor, Oxenstierna, who informed him that the recently slain King Gustavus Adolphus had been a great admirer of De Jure Belli and expressed a desire to bring Grotius into the service of Sweden. A major power, Sweden had risen up as a champion of the Protestant cause in the bloody war that gripped Europe, and Grotius was asked to provide counsel to the young queen and serve as her ambassador to another key power, France. The position required that he renounce his Dutch citizenship in order to declare his loyalty to the Swedish crown. Though he never let go of the hope of returning to his home, he accepted. The de Groot family would once again take up residence in Paris.

As ambassador, Grotius was charged with negotiating the terms of French support for the Protestant alliance. The relations were especially fraught due to the delicate position that the French crown, under the guidance of Cardinal Richelieu, had carved out between its opposition to Hapsburg power and its defense of Catholicism. As France increasingly entered the fray, much of Grotius’ duty was directed to the war effort. His scholarly projects from the late 1630s-40s, however, took as their object a long-cherished goal: the reconciliation and peace of the Christian community. He began work in 1638 on a scriptural commentary that would deflate Protestant rhetoric charging that the Pope was the Antichrist. That same year he slipped an anonymous treatise through an Amsterdam press defending the lay administration of the Eucharist. He then released two lengthy collections of annotations, one on the New Testament and one on the Old, which emphasized the ethical role of the scriptures over the more divisive questions of theology. Building on the idea of shared core doctrines he had explored in his earlier manuscripts, he frankly promoted his vision for a reconciled faith in an appeal printed in Paris in 1642, Via ad Pacem Ecclesiasticam (The Way to Church Peace). Grotius had great hopes that the time was ripe for this vision, but he was disappointed when his arguments were swallowed up in the same old sectarian vitriol.

Having passed the age of sixty, Grotius met with some relief his recall to the Swedish court in 1645. The Queen offered to settle his family in Sweden, but he instead requested a passport so that he could rejoin Maria and pursue opportunities elsewhere. He embarked in August in the midst of a terrible storm that damaged the ship and washed it upon the German coast. The ordeal left him ill and weather-beaten. With the aid of servants, he made it to the town of Rostock where he found a hospice. His condition worsened, and death came on August 28, 1645. Arrangements were made to convey his remains to Delft, where the town of his birth bestowed on him the honor that he could not regain in life by interring his body in the Nieuwe Kerk alongside the most celebrated figures of the republic. Maria resettled in Holland, and their sons set about preparing, from Grotius’ papers, updated editions and previously unpublished manuscripts for the press. De Jure Belli ac Pacis, especially, would come to have enduring influence as the Enlightenment philosophers of the next generations embraced its framework of natural jurisprudence as a model for a modern science of law and morals. His work would become a point of departure for those natural lawyers focusing on the law among nations, from Pufendorf and Barbeyrac to Thomasius and Vattel. It would inspire radical ideas about natural rights and the social contract in the Anglo-American political discourses of Hobbes, Locke, Jefferson and Madison. For the Scottish Enlightenment, it would be required reading, informing the moral theories of Carmichael, Hutcheson, Hume and Smith. As natural jurisprudence gave way to positivism and idealism in 19th-century European thought, the place of Grotius receded in moral and political theory, but his work would be recovered in the context of emerging ideas about the international legal order as the next century approached.
His work is most widely known today among those working on international relations and law, though there has been rapidly expanding scholarship on his contributions to political thought, ethics, and the philosophy of religion.

2. Irenicism and Tolerance

In the politics of the Dutch Republic and with regard to the broader religious strife in Europe, Grotius fashioned himself as an irenicist, one who seeks to bring the different denominations of Christianity together. The inflammatory conflicts among the Christian churches, which remained a persistent cause of war and upheaval in the political life of European societies, were in Grotius’ view largely attributable to excesses of dogmatism (see Heering 2004). If dogmatic claims could be reduced to an agreeable set of core tenets, he reasoned, then the various sects would have grounds for cooperating towards a reunified Christian church while allowing more esoteric matters to be contested without posing a threat to peace. This hope for Christian peace and unity characterizes Grotius’ theologically oriented works from his early Meletius (1611) to Via ad Pacem Ecclesiasticam (1642), among his latest writings at the height of the Thirty Years War.

a. Religion and Civil Authority

In the early decades of the 17th century when Grotius was cutting his teeth in Dutch politics, the temperature was rising on a theological dispute concerning salvation and freedom of the will. The reformed churches, which had the backing of the civil authorities, were founded on orthodox Calvinist doctrine. The standard Calvinist view of salvation held that God’s choice of who would be saved preceded the act of creation; this grace was, consequently, not a status that could be earned through good works but rather was predestined. This view was consistent with the dominant Protestant interpretations of scripture and represented a social and ethical worldview that was compelling to the reformed faithful. Yet this view also carried the ethically troubling implication that individual choice makes no difference to how one stands with God and, as the Leiden professor of theology, Jacob Arminius, would argue, did not account for elements of scripture that seemed to acknowledge a role for human will. Arminius maintained that God’s saving grace was on offer to anyone while still accepting the basic Calvinist premise that, prior to any human act, God had already determined whom He would actually elect to everlasting happiness. The paradox could be resolved by recognizing that God’s grace might be resisted. This elegant solution enabled Arminius to account for freedom of the human will while retaining the key Protestant tenet that grace alone, not works, qualifies the elect. The Arminian view of salvation, to draw on Richard Tuck’s illuminating analogy, understands God’s offer of grace to the elect to be much like a parent’s offer to buy something for a child: “the child can refuse the offer, but he cannot purchase the present himself” (Tuck 1993 p. 182). While representing a significant revision to orthodox Calvinism, this view remained consistent with the larger doctrine.

The political question, however, was whether adherents of the Arminian position should be allowed to teach it within the publicly established churches. Grotius’ writings from this period confront both the theological and political aspects of the debate. On the question of theology his sympathies lay with Arminius, and his defenses of the view led up to the publication of the substantial De Satisfactione (published in 1617), which distinguished many of the Arminian tenets from the ‘Socinian’ heresies charged by the view’s opponents. Politically, the Arminian preachers were seeking a policy of toleration within the public churches. Grotius and others aligned with Oldenbarnevelt recognized the advantages of such a policy for preserving quiet in the republic. Characteristically, Grotius saw the policy as rooted in philosophical concerns. As early as the (unpublished) manuscript Meletius (1611), he was developing a philosophy of religion according to which all faiths shared core beliefs about the nature of divinity and its role in human life. While this view stressed commonality, it did not entail pluralism. A religious tradition may possess a stronger claim to truth than others in virtue of its consistency with the central doctrines and the credibility of its supporting testimony; for Grotius, Christianity held this title. (This defense of Christianity is most fully developed in Grotius’ most widely published and popular work, On the Truth of the Christian Religion.) Yet Christian tradition, too, had a further set of core doctrines which were necessary for proper worship and for the promotion of responsible citizenship. The church could accommodate friendly debate over finer matters of theology as long as it was firmly rooted in the necessary articles of faith. This philosophical framework, while not made fully public at the time, undergirded Grotius’ advocacy of the toleration policy, which the States of Holland would eventually adopt.

The policy, Grotius well understood, required not only justification but also legitimacy: in defining acceptable doctrines, the civil authority was asserting itself in sacred matters. Grotius addressed this issue in his 1613 pamphlet defending the toleration policy, Ordinum Hollandiae et Westfrisiae pietas, and went on to develop the argument for the central principles into a major essay on the authority of civil government over the public practice of religion. De Imperio Summarum Potestatum circa sacra (1614-17, unpublished) argued that the supreme civil power holds legitimate authority over all matters concerning the public interest, whether sacred or profane. In addition to finding support from scripture and tradition, Grotius grounds his case on the simple Aristotelian argument that, because the commands of multiple authorities would allow for conflicting obligations, there can be only one supreme authority in a jurisdiction (ch. 1). Holding this authority enables the supreme power, then, to preserve civil peace as well as to promote, through the effects of religion, the formation of obedient and upright citizens. The bulk of the work is thus occupied with defending the plausibility of this conclusion by clearing away misconceptions and by reconciling it both with the variety of forms of political and legal organization and with the special calling of the church. To accept the authority of the civil power in religious matters, Grotius argues, does not imply that magistrates are competent to determine the truth of all fine points of theology: a wise ruler will make use of counsel from the most reliable pastors. With even greater wisdom, a ruler would do well to abstain from pronouncing on all but the most essential articles of faith, those that are necessary for salvation (ch. 6, 9). 
As an instance of an inessential matter in which a “prudent silence” recommends itself, he offers those “questions about the order of predestination and the reconciliation of human free will with grace” (ibid). The policy of the States of Holland, in this framework, was a form of containment: the policy defined the boundaries of permissible doctrine at the point that would endanger the salvation of those who accept it, while allowing the disagreements inside these bounds to play themselves out. Such was Grotius’ recommendation, in both theory and practice. At bottom, however, the policy had its validity not in view of its laudable tolerance but on Erastian grounds. (The citations in the work acknowledge the influence of Thomas Erastus, who a generation earlier had argued for the supreme authority of the state in church governance.) The central position of De imperio was that any policy issued by the civil power would be valid so long as it did not contradict God’s will. That this Erastian position made room for toleration and contributed to civil peace only added to its appeal.

b. Relations with Non-Christians

The principle of toleration guided Grotius’ handling of the Arminian conflict and also served as an ideal in his view of dealings with non-Christians. Among the groups that had found haven in the Netherlands from the Inquisition were Portuguese Jews, and Grotius was asked during his time as a public official to reconsider what ought to be the policy of the States towards the presence and worship practices of Jewish communities. His Remonstrantie on the question was of a piece with his developing philosophy of public religion: Jewish worship could be consistent with the state interest in religion, as Judaism accepted the fundamental doctrines regarding God’s existence and concern for human conduct. The policy recommendation was to afford civil liberties and freedom of worship to Jews, under certain restrictions that would serve to “safeguard” the salvation of Christians. This meant, for instance, that Jewish synagogues would not enjoy the same freedom to preach to Christian audiences that could be granted to Arminian and Calvinist disputants, but Grotius maintained that this encumbered status was preferable to the other options in the field. He opposed forcing Jews to practice Christianity on the grounds that such a policy was incoherent, since faith cannot be forced, as well as sinful, since it would induce people to false professions. An alternative was to forbid Jewish worship altogether, but this would promote godlessness, which would be intolerable. Finally, to those who were calling for expulsion, Grotius gave a sustained response partly grounded in principles of natural law: the social bond that nature establishes among humans should not be severed except as punishment for crime. Jewish practice did not transgress natural law, and its faith supported civic life. It was proper, therefore, that Christians and Jews share social arrangements on the basis of common principles of public order and justice.

The same balance between Christian privilege and the potential for peaceful cooperation underwrote Grotius’ approach towards the expanding relationships between Europeans and non-Christian societies around the world. The principles of natural justice in De Jure Belli ac pacis—which grounded claims to sovereignty, property, and the fulfillment of pacts—were valid and binding in any human encounter, requiring no special relation to God. The principles would oblige us, in Grotius’ famous phrase, “even if we should concede (etiamsi daremus) that which cannot be conceded without the utmost wickedness, that there is no God, or that the affairs of men are of no concern to Him” (JBP Prol. 11). Mutual recognition of natural law provided the basis for any two parties to arrive at just and peaceful terms of association, most notably those concerning trade and alliances. This did not imply that all practices regarding religion were consistent with natural law. Because a sense of justice is not sufficient to motivate humans routinely to do right, the broader human society, even more than civil societies, depends upon religion to maintain order and instill reverence for its norms (see JBP Prol. 20 and II.XX.XLIV.6). To reject God involves not only the “utmost wickedness” but a criminal disregard for human society. Indeed, the two tenets that Grotius identifies—that there is a God and that human affairs are of concern to Him—constitute what he takes to be the core of religious belief, found in all societies. Those who oppose these core beliefs may be punished, by war if necessary, but differences among the religious are not, in themselves, grounds for war (JBP II.XX.XLVI-XLVIII). Pagans, polytheists, Jews and Muslims might fail to accept the “truths” of Christianity, but their participation in the common faith supports the basic ethical structure of society. 
Christianity, even under non-Christian sovereigns, yet has this privilege: that in virtue of its claim to truth, its adherents must not be punished for teaching the Gospel (JBP II.XX.XLIX). The right to suppress religious doctrine, which De imperio claimed for the civil power, extends only to teachings not essential to Christian salvation.

c. Christian Unity and Peace

The privileged status of Christianity among the world’s religions is the subject of The Truth of the Christian Religion. As in De Jure Belli, composed around the same time, Grotius argues that a basic understanding of divinity and its role in the world is accessible through the use of the natural capacity of reason alone. Such truths include not only the existence and providence of God, but also God’s oneness, perfection, causal responsibility for all that happens, and judgment in the afterlife. The proofs Grotius offers are not original but are borrowed from sources both ancient and recent, owning that people of varying sophistication have long been able to reason back to a necessary and singular ‘first cause’ and to grasp that the perfect nature of such a cause would not neglect the good of all creation (ch. 1). While some of these points require more subtle thought than others, all people can in principle arrive at the conclusions through rational reflection. Christ, however, is known through history. To learn of redemption and of what is required for salvation, one needs access to particular facts about Christ’s coming and His call to the faithful. The relevant facts, still, are supported by reasonable inferences based on reliable testimony (the evangelists), the consensus of historians, and the evidence of miracles performed. This project of deriving religious knowledge through rational investigation is what later philosophers would call “natural religion.” Significantly, Grotius argues that these facts gain further confirmation when one recognizes that the doctrines of Christianity have the greatest intrinsic appeal. The Gospel has this appeal in virtue of the reward it promises (the eternal beatitude of the soul), the quality of its ethical teachings (obeying out of love rather than fear, showing love to neighbors and enemies, and so forth), and the impeccable character of its teacher, Christ (ch. 2). 
Experience and rational consideration, while sufficient to establish the truth of Christianity, may not convince as readily as inferences from mere reason. Indeed, immediate acceptance is not possible without God’s help. On these grounds, Grotius would argue in De Jure Belli that one may neither punish those who fail to embrace Christianity nor impose belief by force (II.XX.XLVIII). Christians would do better to impress non-believers with their ethical example and offer persuasive arguments for conversion.

To this end, De Veritate provides a detailed debunking of other faiths. While its arguments reveal that Grotius undertook a serious study of non-Christian religions (with the aid of friends such as the Hebrew and Arabic scholar Thomas Erpenius), some of his characterizations are far from generous, repeating old slurs about Jewish animosity towards Christians and the violent character of Islam. The arguments of the book were, after all, calculated to serve more than one purpose. Grotius intended the book to be of special use to seamen, who, while off to many corners of the earth to establish Dutch trading interests, would encounter a dazzling diversity of religious belief that might not only elude their attempts at persuasion but also challenge their own faith. It was the Christian reader, most of all, who might need to be assured of the Gospel’s special claim to truth.

The further effect Grotius hoped De Veritate would have on its Christian readers was to impress upon them that, in the range of religious diversity, the similarities among Christians are much more significant than the differences. The irenicist program that Grotius pursued in his later years had two main prongs. The first provided a map for Christian reunification based upon minimal agreement regarding core doctrines, beyond which some difference of belief and practice could be accommodated. The second urged Christians to recognize that the most important lessons to be taken from scripture are its ethical teachings, not its dogmas. This was the simple, practical faith that he saw reflected in the earliest Christian community and in the Christian humanists, like Erasmus, whom he so much admired. It was also a faith of which civil authorities, responsible for civic peace and virtue, could be worthy custodians.

3. Sovereignty and Imperialism

Connecting the political and international thought of Grotius is his conception of sovereignty, the supreme right of governing (summum imperium). The mark of the sovereign power is that it “cannot be made void by any other human will” (JBP, I.III.viii). Within a state, it is the highest authority; internationally it encounters other sovereign powers, among whom none holds a superior right.

a. Divisible Sovereignty

The guiding idea in Grotius’ treatment of sovereignty, as with his treatment of rights generally, is that systems of rights are radically alterable through the ways people choose to dispose of those rights. As a result, societies will vary widely in how they organize the powers of sovereignty. Philosophers might argue for the advantages of one scheme or another, “but as there are several ways of living, some better than others, and every one may choose which he pleases of all those sorts; so a people may choose what form of government they please: neither is the right which the sovereign has over his subjects to be measured by this or that form, of which divers men have divers opinions, but by the extent of the will of those who conferred it upon him” (JBP I.III.viii). What justifies a scheme of rights is that it has arisen from the historical choices of their legitimate holders, not any features of its form. This principle gave Grotius a great deal of flexibility in defending different political arrangements, provided the facts of history for the given society would play along.

On one side, Grotius was able to argue against royalists who sought to define sovereignty as an indivisible package of prerogatives that could be vested in only a singular will. Grotius takes this claim, which Jean Bodin had advanced a generation earlier, at face value but treats indivisibility as a purely conceptual point: to institute civil power in a society consists in gathering up a certain package of governmental rights and in designating who will hold that power supremely. The rights of governing come as a package, but a society may, if it chooses, designate different holders for the various rights.

Grotius developed this position early in his career in an unpublished manuscript that he called Commentary in Eleven Theses. The practical divisibility of sovereignty is an indispensable premise for the political argument of the work, which defends the ongoing Dutch war against the rule of the king of Spain. Unlike earlier apologists, Grotius does not conceive of the war as a revolt based on the right of a people to resist a tyrannical ruler but rather as a war between sovereign powers (see Borschberg 1994 pp. 169ff. and Keene 2002 pp. 45ff.). If one studies the history of rights in the Dutch case, Grotius argues, one finds that the Dutch people did not transfer all governing rights to a prince but reserved some, in particular the right to levy taxes, to the States of Holland. While holding supreme power on many matters, the Spanish king had sought to usurp a further supreme power from the States, an act which provided them a just cause to wage war in defense of their right. Put in the language of sovereignty, the king possessed no right to render void the will of the States when it came to taxation, just as this particular right of the States could not render void the king’s rights in other matters: each was supreme within the scope of its own authority (cf. JBP I.IV.xiii). Grotius retained and systematized this conception of divisible sovereignty in De Jure Belli, where he also considered the criticism that such arrangements based on divided powers were recipes for civil strife. His answer insists on the principle with which he began: while one can point to inconveniences in any arrangements, the only relevant question in matters of right is whether those arrangements were the ones chosen (I.III.xvii).

On the other side of the political spectrum, Grotius argued against theories of popular sovereignty. The position of constitutionalist thinkers, such as those among the reforming Huguenots who would come to be called ‘monarchomachs,’ was that the right of kings to rule derives from the rights of the people; since some of these rights are inalienable, the representatives of the people retain a right to resist a regime that tyrannically usurps these rights. Grotius’ response was to grant that rights originate from the people but to argue that the people can choose to alienate whatever rights they wish, even up to the extreme of enslaving themselves to another (JBP, I.III.viii). Utter subjection to an absolute monarch is, therefore, entirely possible and consistent with the history of political arrangements in many societies. Grotius’ flexible approach enabled him to defend the republican principles alive in the Dutch provinces from one side of his mouth while shoring up the absolutist claims of his later patrons from the other. In his defense of the latter claims, we find Grotius even paying homage to the time-worn doctrine of Aristotle that some people are naturally suited to be slaves. Importantly, Grotius does not admit the doctrine as grounds for imposing slavery but rather repurposes it: the doctrine can explain why a people might choose of their own accord to hand over their full rights to the more prudent government of another. Ineptitude at self-rule, it turns out, is just one of many considerations that might factor into the selection of a form of government.

b. Resistance, War and Empire

Grotius’ understanding of sovereignty carries several implications for his theory of just war. The first concerns his position on the “right of resistance,” the hotly contested question of whether a subject people may ever justly depose a ruler for misgovernment. While Grotius rejects constitutionalist arguments that reserve inalienable rights to the people, he finds a way to preserve this rationale for resistance in a more limited form. It is unlikely that most civil societies would have been founded on utter subjection. In the absence of clear evidence that subjects have completely alienated their rights, one has to presume that rational people would have preserved their most basic rights against arbitrary treatment. This presumption attaches only in cases of “extreme necessity,” as when a government turns its sword on innocent subjects, and then only when resistance could be carried out without creating an even bloodier civil conflict (I.IV.vii). When Grotius invokes this argument from extreme necessity, he relies on what Richard Tuck has called a kind of interpretive charity (1979 pp. 79-80): since civil authority is a human institution, the bounds of which are derived from the wills of those who established it, one must credit the founders with intentions that would rationally advance, not undermine, the aims of civil association. (Compare the parallel reasoning in limiting the rights of property, II.II.vi.) Second, Grotius assigns a role in this context to third-party humanitarian intervention. Even if it should turn out that subjects must bear the most arbitrary assaults from their proper sovereign, a third party would remain free from the special obligations that constrain subjects from resisting and could intervene on their behalf. Such interventions should only be attempted when it is evident that a government is committing gross injustices against its people: “such Tyrannies over subjects, as no good Man living can approve” (JBP II.XXV.viii). 
The third implication concerns Grotius’ complicated relation to imperialism. In defending the legitimacy of diverse forms of political authority, he is rejecting the principle behind those forms of imperialism that seek to impose a more enlightened form of rule for the good of the governed. Elsewhere in De Jure Belli he explicitly refutes the argument that slavery can be imposed on those who might be naturally suited to it (II.XXII.xii) and castigates those who claim rights of ‘discovery’ over lands already occupied by supposedly less enlightened folk (II.XXII.ix). On these points, he is in agreement with earlier critics of the Spanish conquests such as Francisco de Vitoria and Bartolomé de las Casas.

The strategies of commercial imperialism, which characterized Dutch practice, found much more support in Grotius’ theory of just war (see generally, Tuck 1999 ch. 3, van Ittersum 2006, Wilson 2008, Thomson 2009). The whole concern of De Jure Belli is how to justly settle controversies in the dealings of those who do not live under a shared system of civil laws. In the context of global trade, such dealings will involve the claims of private parties as well as the contentions of kings and states. It ultimately falls to each party, when operating outside the jurisdiction of a common court, to judge the controversy based on the applicable standards of natural, customary, state and divine law. Significantly, Grotius maintains that such relations can be peaceful so long as those involved have a clear understanding of the law and hold themselves to norms of justice, equity, temperance, and humanity. Yet, just as magistrates duly back their rulings with force, those involved in a dispute have the right to redress injuries by means of war. Used rightly, De Jure Belli would provide all parties with a clear understanding of how the law applied to various disputes and educate them in how to render fair and responsible verdicts. However, used rightly, it would also give trading powers the flexibility to leverage their arrangements with non-Europeans and the justifications to uphold these arrangements with force. One stratagem it enabled was encroachment on local sovereignty (see Keene 2002 pp. 48ff and 79ff). Grotius’ position was firmly that non-Christian rulers could hold full title to sovereignty, but his view of sovereignty was that its marks could be divided up among various holders. A foreign trading power might enter into an alliance with a ruler that required him, for instance, to provide land for a trading ‘factory’ or deliver up his people’s labor. 
These arrangements do not, in themselves, transfer any mark of sovereignty, but Grotius argues that, if the foreign power (unjustly) usurps this right over time without being challenged, its “long possession” provides it with a claim to sovereignty that is now just (JBP I.III.XXI.10-11). Because marks of sovereignty can be divided off in this way, the foreign power can take over limited rights of its own without being guilty of usurping the broader authority of the king. Once the limited right was established, however, it could also be protected with force should the king try to reconsolidate his power (by the same right that the Dutch defended their limited sovereignty against the ambitions of their Spanish overlord). Had the rulers of Southeast Asia read Grotius’ work, they might have found a useful warning about the risks of getting entangled with a powerful ally; the readers among the European mercantile class would also see its usefulness.

The natural-right framework of De Jure Belli also empowers parties to a contract to arrive at their own judgments about how to interpret indeterminate clauses (JBP II.XVI) and authorizes any party, public or private, to execute punishment for culpable violations of the law (II.XX). The idea that war-making can be understood as an extension of the right to punish had been part of the Christian just-war tradition from Augustine through Vitoria and Suarez, but Grotius reconceives punishment as a natural right that obtains prior to civil authority (see Tuck 1999 pp. 102f. and Straumann 2006). In circumstances beyond civil jurisdiction, law-respecting persons can take it upon themselves to police and punish crimes affecting society. Because this exercise of power over another assumes a position of superiority, Grotius recognizes the need to explain how this difference in standing can arise among those who are equal by nature. His solution is to point out that violators demote themselves beneath the rest of humanity (JBP II.XX.iii). Anyone who remains in this position of moral superiority can properly execute punishment. The natural right to punish was an important innovation in Grotius’ early De Indis, where he argued that Dutch merchants had legitimate authority to punish the Portuguese for monopolizing the seas (fol. 40). It remains a key feature of his theory of punishment in De Jure Belli, where it provides a further source for just causes to resort to war. In contrast to the anti-imperialist arguments of Vitoria and the school of Salamanca, which had maintained that the princes of Europe had no authority to punish those beyond their jurisdiction except in response to ‘an injury received’ (On the Law of War q.1 a.3; see also On the American Indians q. 
2 a.5), Grotius opens the door to punitive war against those who commit ‘crimes against nature.’ Elevated as moral superiors above regimes that enjoin or condone manifestly unjust practices—including cannibalism, piracy, the oppression of their own people or the cruel treatment of foreigners—outside powers may seek to punish these regimes in the interests of human society (II.XX.XL). Adopted while Grotius still had ties to the interests of the Dutch trading companies, this interventionist stance would have expanded the range of justifications available for colonizing lands in both Asia and the Americas (see Tuck 1999 pp. 103-4 and van Ittersum 2010).

At the same time, Grotius shows an awareness, and some discomfort, that his position could be used as a pretext for expansionist wars. He cautions that only violations of universal norms, not of the evolving customs of Europe, count as punishable offenses. Quoting Plutarch, he explicitly warns of the lurking temptations of imperialism: “To wish to impose civilization upon uncivilized peoples is a pretext which may serve to conceal greed for what is another’s” (II.XX.XLI). The structure of Grotius’ position, characteristic of the framework of De Jure Belli, insists on strict adherence to norms of justice, equity and humanity while still affording the powerful the flexibility to interpret, judge and enforce those norms by their own lights.

4. Natural Right and the Law of Nations

The broadest principles of just war in De Jure Belli ac pacis derive from two sources: the norms of natural justice and the customary law of nations (ius gentium). (Other human and divine laws, importantly, also lay down binding principles for those who have received them, but these sources do not have the universal character of the laws of nature and nations.) On any given question regarding the resort to war or its conduct, both systems of law must be consulted, as each system is capable of influencing the rights and obligations of the other.

a. Obligations from Nature and Custom

The account of natural law in De Jure Belli, heavily influenced by the Stoic notions of Cicero, begins from two universal human concerns: self-preservation and social connection (see JBP I.II.I and Prol. 6-8). The rights and obligations of natural law are all justified in terms of the rational balancing of these two primary concerns. This approach is an outgrowth of Grotius’ earliest work on the laws of war, De Indis, where he argued that the imperative of self-preservation justified two permissions of natural law: to defend one’s life and to acquire possessions (fol. 5’-6). The need for human fellowship justifies two basic obligations towards others: to refrain from inflicting injury and from seizing their possessions (fol. 6’-7’). One apparent change that Grotius makes to his earlier theory regards the basis for these obligations. In De Indis, he aligns himself with a voluntarist account of obligation, found in medieval thinkers such as Ockham, which maintains that natural law is binding upon humans in virtue of the divine will that commands it (fol. 5’). The design of nature is one way in which we receive God’s commands. By the time of De Jure Belli, Grotius seems to accept the alternative, intellectualist position that natural law binds us by teaching what both humans and God can recognize as necessary for human life: it shows us not what is obligatory because commanded but what is obligatory or permissible “in itself” (JBP I.I.x). In fact, there is much ambiguity in the later work as to which position Grotius accepts, showing itself even in his very definition of natural law as “a dictate of right reason, which points out that an act has in it a quality of moral baseness or moral necessity; and that, in consequence, such an act is either forbidden or enjoined by the author of nature, God” (JBP I.I.x). 
This definition is perhaps closest to the ‘mediating’ position more recently advanced by Suárez, maintaining that intellect could recognize what is, in itself, good or bad for humans but that only God’s command makes it obligatory to live accordingly (De Legibus II.VI; see Schneewind 1998 pp. 61 and 74).

What is clear is that Grotius draws a basic distinction in law, following Aristotle, between obligations derived from nature and those derived from an authoritative will (JBP I.I.ix and xiii-xvi). Sources of this second, ‘volitional’ type of law can be divine (as revealed in scripture) or human, and the latter includes not only the laws of particular states but also those laws that nations accept in their relations with each other. Kings and peoples give their assent to the law of nations through custom, not typically by positive agreement. Long observance of a norm in the relations between states gives it the force of law. In contrast to natural law, which confers its basic rights and obligations to all persons whether in a private or public capacity, the law of nations applies to relations between sovereign entities (cf. JBP Prol. 40; De Indis fol. 12ff). It deals, accordingly, largely with matters of state, such as embassies, treaties, and the special privileges of sovereigns in waging war. This system of customary law, it turns out, makes the legal position of sovereigns radically different from that of private actors in the ‘universal society’ established by natural law.

b. Just War: Jus ad Bellum

The mutual influence of the laws of nature and nations can be seen both in the resort to war (traditionally called the jus ad bellum) and in its conduct (jus in bello). The only just grounds for resorting to war are those that involve the pursuit of a right. Among such pursuits, Grotius identifies three kinds: self-defense, the recovery of property and punishment. Each of these has its basis in natural law, though the particular rights at issue might arise from other sources, such as the law of nations. The right of self-defense arises from the natural permission every person has to protect against injury (II.I.iii). If our primary concern is self-preservation, we could not take the risk of living among other people without reserving the permission to protect ourselves from them. The right of defense extends not only to one’s life, but also to one’s body and property. Grotius argues that killing in defense of one’s body is justifiable even if the assailant’s objective is not to kill but to maim or rape (II.I.vi-vii). The reason is that one can never trust that a physical assault will not result in death (though it is unclear in Grotius’ treatment of rape whether it is the victim’s life or the interests of men in her ‘chastity’ that is the justifying concern). There are two constraints on justified self-defense: that the attack is imminent and that it is certain (II.I.v). Defense is a just cause that applies only to immediate danger. Even property, however, may be defended with lethal force, with the further constraint that such force is necessary for retaining it (II.I.xvi).

Apart from defense, war may be waged in order to recover one’s rights or to punish the offender. Acting under these just causes will often entail being the one to initiate violence. Grotius argues that this breach of peace is not anti-social (and hence in violation of natural justice) because the initiator is only demanding what the other party already owes (I.II.i.5-6) – they are not violating but upholding the system of rights. Recovery of property applies not only to moveable things and territory, but also to rights over persons (such as rightful subjects or slaves), rights to actions (such as the fulfillment of contracts), and compensation for damages. All of these might be claimed by natural right, though the particular claims might be shaped by prevailing domestic systems of property or by the law of nations. This single heading yields an expansive range of cases in which war is a just option for enforcing rights. Punishment multiplies such cases. When someone willfully violates a right, they become obligated not only to make restitution but to endure punishment equivalent to their crime. Any law-respecting person (as explained above) may execute this punishment, in principle, though a number of factors will tend to limit international punishment. Due to the high risk of harming the innocent in pursuit of the guilty, punitive wars are permissible only for serious crimes (II.XX.xxxviii). In most circumstances, only sovereign governments will be permitted to execute the punishment since individual citizens would have transferred this natural right to their state (see II.XX.xxiv and II.XX.xl; cf. De Indis, fol. 40-40’). Public authorities, therefore, can lay claim to special punitive causes such as the punishment of crimes against natural society (see above) and anticipatory defense. Whereas only an actual attack can justify self-defense, a plot to attack, once set in motion, is already a crime (II.I.xvi). 
Under the cause of punishment, a state may resort to preemptive warfare which defense alone could not justify. Finally, every exercise of punishment must be limited to the achievement of certain goods. While the right to punish has a retributive justification rooted in the offender’s obligation to endure it, the exercise of this right ought to be governed by consequentialist considerations. The good of the offender, of the victim and of the broader society, are all relevant benefits that need to be weighed against the harms to each of these (II.XX.iv-ix). Especially when the consequences of punishment include a broader war, these considerations may urge clemency, restraint or even pardon (II.XX.xxii-iv and xxxiv-xxxvi; see II.XXIV.ii-iii).

There is a general pattern of argument—that people are permitted, in the strictness of justice, to use violence in a great many cases that will nonetheless call for moderation in the name of humanity and peace—that characterizes the whole of De Jure Belli ac Pacis. Justice is a crucial virtue, as the maintenance of society and respect of law require it, but its guidance is limited to these minimal aims. To know what the laws ought to be and to decide when and how far to exercise one’s rights, it becomes necessary to follow the promptings of equity, humanity and prudence. These “virtues which have as their object the good of others” (I.I.viii) not only serve to measure the proper severity of punishments but also to determine whether war for a punitive cause is warranted at all. Humaneness imposes a moral limit, too, in how far one ought to press rights to property, so as not to use market power to squeeze people (II.XII.xvi) or to withhold vital information when making contracts (II.XII.ix). Even in self-defense, the resort to war can have humanitarian consequences that speak strongly against making full use of one’s right (II.I.iv, viii, ix and xi). It would be a grave error, Grotius warns, to think that “where a right has been adequately established, either war should be waged forthwith, or even that war is permissible in all cases” (II.XXIV.i). The resort to war must be squared not only with justice but with humanitarian concerns, especially for its impact on the lives of innocent people. This loving regard for others that aspires to universality is what Grotius held up, in his works on religion, as the great ethical appeal of the Gospel, and De Jure Belli instructs its readers to recognize that not only humanity but also God calls them to love, forbearance and restraint.

c. Just War: Jus in Bello

The meshing of these normative standards of justice and humanity is especially pronounced in Grotius’ treatment of the conduct of war in Book III of De Jure Belli. The natural law provides but one basic rule for the conduct of war: “things which lead to an end receive their intrinsic value from the end itself” (JBP III.I.ii). That is, if one has a right to resort to war, then one has a right to conduct the war by whatever means are necessary to vindicate the just case. Grotius finds natural justice an unsatisfactory basis for the ethics of combat for two main reasons: (i) it permits inhumane and intemperate actions on the part of those who fight under a just cause, and (ii) it provides no guidance whatsoever for those who fight under an unjust cause. The answer to the first deficiency is Grotius’ account of temperamenta, discussed below. The second deficiency finds its solution in the law of nations. Grotius recognizes that while no war can be naturally just on both sides—a right on one side precludes a right on the other—wars may be either unjust on both sides or justifiably believed to be just on both sides. In either case, there are belligerents for whom natural justice provides no guidance other than, ‘your cause is unjust: stop fighting.’ Grotius resigns himself to the realism that, aside from exceptional cases, most states will not admit to the injustice of their cause and simply stop fighting. The longer such states fight, the more injustices they pile up by resisting the just party. Before long there would be no limit to the punitive war that could be prosecuted against the unjust state (see III.IV.iv). Grotius suggests that nations, recognizing the perils of this situation, established a custom of holding both parties in a war to have equal standing on the battlefield. That is, the law of nations permits to both sides (regardless of the justice of their cause) all the actions that the natural law would permit to the just.

The customs of warfare under the law of nations turn out to be extremely permissive. Tracking the prevailing practice of states, the customs permit everything from the slaughter of innocents to the taking of slaves and the looting of civilian property. License to conduct warfare in this way is the special privilege of sovereigns who have ‘solemnized’ their war under the law of nations. Indifferent to the substantive justice of a state’s cause, the law of nations insists instead on certain formalities—a public declaration by the sovereign authority—to give the belligerent its legal status in a solemn war (I.III.iv and III.III). While Grotius defends this status as a way of restoring normal relations between sovereigns at the end of war, he insists that even kings remain accountable to natural justice. The law of nations is derived from human will, and the license it gives in solemn wars cannot contradict the requirements of natural law. The license amounts to an agreement among nations not to punish each other for certain acts (III.IV.ii-iii). So, after many lengthy chapters detailing the range of actions permitted by the law of nations, Grotius takes an abrupt turn, telling the reader that he must now retrace his steps and “deprive those who wage war of nearly all the privileges which I seemed to grant, yet did not grant to them” (III.X.i). Those waging a solemn war may have the privilege of impunity under human law, but a ‘sense of shame’ ought to instill a respect not only for the ‘external’ judgments of the courts but for the ‘internal’ judgments of conscience (III.X and III.XI.i-ii). Those waging an unjust war will be accountable to God, and they have an (unenforceable) obligation to make restitution to those they have wronged. Even those waging war for a just cause should observe the limits of natural justice by sparing the innocent and pursuing only those war aims that are necessary to securing one’s rights. 
Conducting war merely within the bounds of the law of nations may obtain impunity, but it brings no badge of honor.

What makes kings and peoples worthy of honor is their observance of temperamenta: moderation and restraint in pursuing their just claims. Such restraint comes out of a respect for justice—by restricting the means of war to only what is necessary to achieving the ends—and also out of a sense of humanity. This humane concern for others seeks to limit the impact of war on the innocent and even those fighting on the opposing side (see, for example, III.XI.viii, XII.viii, and XIII.iv). It requires in many cases the remission of punishment, the forgiveness of burdensome war debts, and a preference for restoring local sovereignty rather than imposing imperial rule. At all events, one must uphold good faith in agreements made with the other side in order to build the basis for normal relations after the war (III.XXI-XXV). Humanity holds in view not only the aim of restoring rights but of restoring peace (see III.XXV.ii-iii). Justice might condone war against injuries that threaten the basis for living together in society, but a sense of humanity is fostered by the recognition that we must live together again.

5. Scholarly Interest in Grotius

In the century following his death, Grotius’ works came to be viewed as pivotal in the development of early modern moral and political philosophy. Jean Barbeyrac, in his 1749 essay on the emerging Science of Morality, described Grotius as “breaking the ice” of medieval dogma to make way for a rational approach to ethics. The natural law philosophy of the seventeenth and eighteenth centuries—from Pufendorf to Locke, Vattel and Thomasius—took the framework of De Jure Belli ac Pacis as a point of departure. This canonical status made Grotius required reading for Enlightenment intellectuals, such that Rousseau would come to describe him in Emile, however critically, as “the master of all the savants” and Adam Smith would credit him in his lectures on jurisprudence as giving the world the most systematic treatment of the subject to date. The 21st century has seen a renewed debate among scholars over the extent of Grotius’ ‘originality’ in moral thought and in what it consists: the purported secularism of his approach, its rationalism, its refutation of skepticism, its account of obligation, or a variety of other candidates. Beyond these disputes, recent historians of moral and political philosophy have taken special interest in Grotius’ conception of natural rights, his theory of punishment, and his accounts of property and state sovereignty.

Grotius’ legacy, however, is most strongly connected to his contributions to international legal theory and the laws of war. Interest in Grotius saw a revival in the late nineteenth century amid efforts to articulate and institutionalize norms of international law. The peace societies of the time, closely bound up with the international women’s suffrage movement, traced back to Grotius the evolving conscience of the ‘civilized’ world towards justice and mercy in international conflicts. Andrew Dickson White, the U.S. delegate to the 1899 Hague Peace Conference, regarded Grotius—whom he classed among the world’s Seven Great Statesmen in the Warfare of Humanity with Unreason—as providing the “real foundation of the modern science of international law.” While the claim to being ‘father’ of this law was as disputed as it was common, and despite many critical views of this work—in his 1925 history of political philosophy, Charles Vaughan had called De Jure Belli a “nest of sophistries and contradictions”—Grotius came to have a canonical status in international legal thought. By the end of the Second World War, the legal scholar Hersch Lauterpacht was able to discern a ‘Grotian tradition in international law’ rooted in commitments to the rule of law, to norms beyond positive law, and to the human capacity for moral progress in the law. Grotius continues to be most widely known within the study of just war theory and international law, most notably for the contribution of Mare Liberum to the modern law of the sea.

The preeminence of Grotius in the field of international law exerted its influence as well on the development of international relations theory. Theorists of international relations have commonly viewed Grotius as providing a distinctive conception of international society that provides a middle way between Hobbesian anarchy and Kantian cosmopolitanism. In this schema of ‘realist,’ ‘rationalist,’ and ‘revolutionist’ theories, proposed by Martin Wight and pursued in the work of Hedley Bull and others of the ‘English School’ of international relations theory, the Grotian tradition provides a rationalist account of international society. While rejecting the idea that there are common interests among states sufficient to underpin a supranational authority, the Grotian system identifies a ‘solidarity’ of interests around basic principles of order (such as mutual independence, adherence to promises, the limitation of war) that enables sovereign states to constitute their relations as a (limited) community rather than as a contest governed by the dynamics of power alone. The association of Grotius with this strain of thought has given his work enduring interest in contemporary international theory.

While reaching the greatest prominence in international thought, early 21st-century scholarship on Grotius has a markedly interdisciplinary character. His works have received considerable attention from political theorists and historians of political thought, as well as from those studying his contributions to moral philosophy, theology and literature. Indeed, the eclecticism of Grotius’ thought pushes beyond modern disciplinary boundaries and spurs continuing dialogues across fields and borders.

6. References and Further Reading

Included in the Primary Sources are selected works of Grotius, with a preference for the most recent in-print English editions. (Note: references to De Jure Belli in the article provide the book, chapter and section numbers, e.g., II.XXIV.i.) The selected secondary sources include references from the article as well as suggested directions for further reading. The interested scholar will also want to consult the regularly published journal of Grotius studies, Grotiana.

a. Primary Sources

Grotius, H. (2006). De Jure Praedae Commentarius / Commentary on the law of prize and booty. Indianapolis, Liberty Fund.
Grotius, H. (1994). “Commentarius in Theses XI”: an Early Treatise on Sovereignty, the Just war, and the Legitimacy of the Dutch Revolt, P. Lang.
Grotius, H. (2004). The Free Sea. Indianapolis, IN, Liberty Fund.
Grotius, H. (1988). Meletius. Leiden, Netherlands, Brill.
Grotius, H. (1990). Defensio Fidei Catholicae de Satisfactione Christi, adversus Faustum Socinum Senensem. Assen/Maastricht, the Netherlands, Van Gorcum.
Grotius, H. (2001). De Imperio Summarum Potestatum circa Sacra. Studies in the history of Christian thought, v. 102. H.-J. v. Dam. Leiden, Brill.
Grotius, H. (1926). The Jurisprudence of Holland. R. W. Lee. Oxford, Clarendon Press.
Grotius, H. (2005). The rights of war and peace. Indianapolis, Liberty Fund.
Grotius, H. (1962). De Jure Belli ac pacis libri tres / The Law of War and Peace. Indianapolis, Bobbs-Merrill.
Grotius, H. (2012). The Truth of the Christian Religion. Indianapolis, Liberty Fund.

b. Secondary Sources

Borschberg, P. (1994). “Critical Introduction.” “Commentarius in Theses XI”: an Early Treatise on Sovereignty, the Just War, and the Legitimacy of the Dutch Revolt. H. Grotius, P. Lang.
Brett, A. (2002). “Natural Right and Civil Community: The Civil Philosophy of Hugo Grotius.” The Historical Journal 45(01): 31-51.
Bull, H., B. Kingsbury, et al. (1990). Hugo Grotius and International Relations. New York, Clarendon Press.
Dumbauld, E. (1969). The Life and Legal Writings of Hugo Grotius. Norman, University of Oklahoma Press.
Forde, S. (1998). “Hugo Grotius on Ethics and War.” American Political Science Review 92(3): 639-648.
Haakonssen, K. (1985). “Hugo Grotius and the History of Political Thought.” Political Theory 13(2): 239-265.
Heering, J. (2004). “Hugo Grotius’ De Veritate Religionis Christianae.” Hugo Grotius as Apologist for the Christian Religion: a Study of his Work De veritate Religionis Christianae, 1640. J. Heering. Leiden, Brill: 41-52.
Keene, E. (2002). Beyond the Anarchical Society: Grotius, Colonialism and Order in World Politics, Cambridge University Press.
Kinsella, H. M. (2006). “Gendering Grotius: Sex and Sex Difference in the Laws of War.” Political Theory 34(2): 161.
Meijer, J. (1955). “Hugo Grotius’ “Remonstrantie”.” Jewish Social Studies 17(2): 91-104.
Nellen, H. and E. Rabbie, Eds. (1994). Hugo Grotius Theologian: Essays in Honor of G.H.M. Posthumus Meyjes. New York, Brill.
Onuma, Y., Ed. (1993). A Normative Approach to War: Peace, War, and Justice in Hugo Grotius. Oxford, Clarendon Press.
Schneewind, J. B. (1998). The Invention of Autonomy: A History of Modern Moral Philosophy. New York, Cambridge University Press, ch. 4.
Straumann, B. (2006). “The Right to Punish as a Just Cause of War in Hugo Grotius’ Natural Law.” Studies in the History of Ethics.
Suárez, F. (1944). De Legibus. Selections from Three Works. New York: Clarendon Press.
Thomson, E. (2009). “The Dutch Miracle, Modified. Hugo Grotius’s Mare Liberum, Commercial Governance and Imperial War in the Early-Seventeenth Century.” Grotiana 30(1): 107-130.
Tuck, R. (1993). Philosophy and Government, 1572-1651. New York, Cambridge University Press, ch. 5.
Tuck, R. (1999). The Rights of War and Peace : Political Thought and the International Order from Grotius to Kant. New York, Oxford University Press, ch. 3.
van Gelderen, M. (1993). “Vitoria, Grotius and Human Rights: The Early Experience of Colonialism in Spanish and Dutch Political Thought.” Human Rights and Cultural Diversity. W. Schmale. Goldbach, Germany, Keip Publishing: 215-238.
van Gelderen, M. (2006). ‘So Meerly Humane’: Theories of Resistance in Early-Modern Europe. Rethinking the Foundations of Modern Political Thought. A. S. Brett, J. Tully and H. Hamilton-Bleakley. New York, Cambridge University Press: 149-170.
van Ittersum, M. J. (2006). Profit and Principle : Hugo Grotius, Natural Rights Theories and the Rise of Dutch Power in the East Indies, 1595-1615. Leiden, Brill.
van Ittersum, M. J. (2010). “The Long Goodbye: Hugo Grotius’ Justification of Dutch Expansion Overseas, 1615-1645.” History of European Ideas 36: 386-411.
Vitoria, F. d. (1991). On the American Indians. Political writings. A. L. J. Pagden. New York, Cambridge University Press.
Vitoria, F. d. (1991). On the Law of War. Political writings. A. L. J. Pagden. New York, Cambridge University Press.
Vreeland, H. (1917). Hugo Grotius, the Father of the Modern Science of International Law. New York, Oxford University Press.
Wilson, E. M. (2008). The Savage Republic: De Indis of Hugo Grotius, Republicanism, and Dutch Hegemony within the Early Modern World-System (c. 1600-1619), Martinus Nijhoff Publishers.

Author Information

Andrew Blom
Email: andrew.blom@cmich.edu
Central Michigan University
U. S. A.

The Moral Permissibility of Punishment

The legal institution of punishment presents a distinctive moral challenge because it involves a state’s infliction of intentionally harsh, or burdensome, treatment on some of its members—treatment that typically would be considered morally impermissible. Most of us would agree, for instance, that it is typically impermissible to imprison people, to force them to pay monetary sanctions or engage in community service, or to execute them. The moral challenge of punishment, then, is to establish what (if anything) makes it permissible to subject those who have been convicted of crimes to such treatment.

Traditionally, justifications of punishment have been either consequentialist or retributivist. Consequentialist accounts contend that punishment is justified as a means to securing some valuable end—typically crime reduction, by deterring, incapacitating, or reforming offenders. Retributivism, by contrast, holds that punishment is an intrinsically appropriate (because deserved) response to criminal wrongdoing. Each type of account has been roundly criticized, on a variety of grounds, by theorists in the other camp. In an effort to break this impasse, scholars have attempted to find alternative strategies that incorporate certain consequentialist or retributivist elements but avoid the standard objections directed at each. Each of these accounts has, in turn, met with criticism. Finally, abolitionists argue that none of these defenses of punishment is satisfactory, and that the practice is morally impermissible; the salient question for abolitionists, then, is how else (if at all) society should respond to those forms of wrongdoing that we now punish.

This article first looks more closely at what punishment is; in particular, it examines the distinctive features of punishment in virtue of which it stands in need of justification. It then highlights various questions that a full justification of punishment would need to answer. With these questions in mind, the article considers the most prominent consequentialist, retributivist, and hybrid attempts at establishing punishment’s moral permissibility. Finally, it considers the abolitionist alternative.

What is Punishment?
Various Questions
Consequentialist Accounts
Retributivist Accounts
Alternative Accounts
Abolitionism
References and Further Reading

1. What is Punishment?

When we consider whether punishment is morally permissible, it is important first to be clear about what it is that we are evaluating. Theorists disagree about a precise definition of punishment; nevertheless, we can identify a number of features that are commonly cited as elements of punishment.

First, it is generally accepted that punishment involves the infliction of a burden. The state confines people in jails and prisons, where liberties such as their freedom of movement and association, and their privacy, are heavily restricted. It imposes often heavy monetary sanctions or forces people to take part in community service work. It subjects people to periods of probation during which their movements and activities are closely supervised. In the most extreme cases, it executes people. Theorists disagree on precisely how to characterize this feature of punishment. Some describe punishment as essentially painful, or as involving the infliction of suffering, harsh treatment, or harm. Others instead write of punishment as involving the restriction of liberties. However we characterize the specific nature of the burden, it is relatively uncontroversial that punishment in its various forms is burdensome.

One might object that some prisoners could become accustomed to incarceration and so not see it as a burden, or that the masochist might even enjoy his corporal punishment. In response to supposed counterexamples such as these, a defender of the “burdensomeness” feature of punishment might argue that the comfortable prisoner and the masochist are still punished insofar as they are treated in ways that are typically regarded as burdensome by those on whom they are inflicted. Alternatively, one might argue that a particular case of incarceration, corporal punishment, and so forth, indeed does not count as punishment if the prisoner does not find it burdensome (Boonin, 2008: 8-10). Whatever one makes of these attempted counterexamples, it remains the case that punishment theorists by and large agree that burdensomeness is an essential feature of punishment.

But punishment is not merely burdensome. A second widely accepted feature of punishment is that it is intended to be burdensome. This feature distinguishes punishment from other forms of treatment that may be burdensome but are not intentionally so. Many people undoubtedly regard it as burdensome to pay their taxes, for instance, but presumably most do not regard this as a form of punishment. This is because although taxes may be foreseeably burdensome, they are not intentionally so. That is, the state does not levy taxes intending for them to be burdensome; rather, the intention is to pay for roads, an education system, and other public goods. That paying for these goods is burdensome to many taxpayers is incidental, and if there were a way to collect sufficient revenue to pay for needed public goods without this being a burden to taxpayers, then so much the better.

Punishment, however, is different. Punishment is intended to be burdensome. If it were not burdensome, then it would not be doing its job. For instance, as we will see below, some theorists contend that the aim of punishment is to reduce crime by deterring potential criminals. But for the threat of punishment to be the sort of thing likely to deter criminals, the punishment itself must be burdensome. Other theorists (retributivists) contend that wrongdoers deserve to suffer, and that punishment is justified as the infliction of this deserved suffering. Here again, the burdensomeness of punishment is not merely incidental; it is intended.

Of course, not all impositions of intended burdens count as punishment. A third commonly accepted feature of punishment is that it is imposed on someone guilty of an offense, as a response to that offense. Actually, there is some disagreement about this point. To count as punishment, must it be imposed on someone who is actually guilty of a crime? Or would it make sense to talk of punishing an innocent person (either mistakenly or intentionally)? Some scholars contend that punishment must be of a guilty person. Susan Dimock writes, “The innocent may be ‘victimized’ by the penal system, but they cannot be ‘punished’” (Dimock, 1997: 42). By contrast, H. L. A. Hart contends that we should acknowledge not only punishment of actual offenders, but also cases (which he calls “sub-standard or secondary”) of punishment “of persons…who neither are in fact nor supposed to be offenders” (see Hart, 1968: 5).

A fourth feature of punishment, widely acknowledged at least since the publication of Joel Feinberg’s seminal 1965 article “The Expressive Function of Punishment,” is that it serves to express condemnation, or censure, of the offender for her offense. As Feinberg discusses, it is this condemning element that distinguishes punishment from what he calls “nonpunitive penalties” such as parking tickets, demotions, flunkings, and so forth (Feinberg, 1965: 398-401). As we will see below, some scholars have taken this expression of censure to be central to the justification of punishment. But whether or not it plays a role in the justification of the practice, this expressive function is typically accepted as a distinctive feature of punishment.

Finally, it is worth highlighting that this article focuses on the legal institution of punishment—rather than, say, parents’ punishment of their children or other interpersonal cases of punishment (but see Zaibert, 2006). Legal theorists often assert as one of punishment’s features that it must be imposed by a properly constituted legal authority (typically, the state). They thereby aim to differentiate legal punishment from private vengeance or vigilantism. This does not mean we must accept uncritically that the state is the proper authority to impose punishment. Ideally, a full account of punishment should provide a plausible answer to why (or if) the state has an exclusive right to impose punishment.

These, then, are the most commonly cited features of punishment: punishment involves the state’s imposition of intended burdens—burdens that express social condemnation—on people (believed to be) guilty of crimes, in response to those crimes. This is not intended as a precise definition or a set of necessary and sufficient conditions for punishment. Theorists may disagree about particular elements, or especially about how exactly to flesh out the various elements. But this description is sufficient to give us a sense of why punishment stands in need of justification: It involves the state’s treating some of its members (imposing intentionally burdensome, censuring sanctions) in ways that typically would be morally impermissible.

2. Various Questions

When theorists ask whether punishment is justified, they typically assume a backdrop in which the legal system administering punishment is legitimate, and the criminal laws themselves are reasonably just. This is not to say that they assume that all legal systems are legitimate and all criminal laws are reasonably just in the actual world. Indeed, questions of political legitimacy and criminalization are important topics that have received a great deal of attention in their own right. But even in societies in which the legal system is legitimate and the laws are reasonably just, a general question arises of whether (and if so, why) it is permissible for the state to impose intended, censuring burdens on those who violate the laws.

This general question of punishment’s moral permissibility actually comprises a number of particular questions. A full normative account of punishment should provide answers to each of these questions.

First, there is the question of punishment’s function, or purpose. Put simply, what reason is there to want an institution of punishment? H. L. A. Hart referred to this as punishment’s “general justifying aim,” although this term may be misleading in two ways: on one hand, to say that the aim is justifying implies that it is sufficient, by itself, to establish punishment’s permissibility. As we will see, some scholars point out that more is needed to justify punishment than merely citing its function, no matter how valuable. On the other hand, talk of a justifying aim seems to privilege consequentialist accounts, according to which punishment is justified as a means to some socially valuable goal. But even for retributivist accounts, according to which punishment is justified not as a means to some end but rather as an intrinsically appropriate response to wrongdoing, we still need an explanation of why such a response is important enough to warrant the state’s institution of punishment. A first question, then, is what sufficiently important function punishment serves.

Even if we establish some sufficiently valuable function of punishment, this may not be enough to justify the practice. Some scholars contend that a crucial question is whether punishment violates the moral rights of those punished. If punishing offenders violates their rights, then it may be morally impermissible even if it serves some important function (Simmons, 1991; Wellman, 2009). What we need, according to this view, is an account of why, in principle, the practice of imposing intended burdens on people in the ways characteristic of punishment does not violate their moral rights.

In addition to justifying the practice of punishment in general, a complete account of punishment should also provide guidance in determining how to punish in particular cases. Even if the institution of punishment is morally permissible, a particular sentence may be impermissible if it is excessively harsh (or on some accounts, if it is too lenient). What principles and considerations should guide assessments of how severely to punish?

Relatedly, although this point has received less attention, we should ask not only about the appropriate severity of punishment but also about the proper mode of punishment. We may critique certain sentences not in virtue of their severity but because we believe the form of punishment (incarceration, capital punishment, and so forth) is in some sense inappropriate (Reiman, 1985; Moskos, 2011). What considerations, then, should guide assessments of whether imprisonment, fines, community service, probation, capital punishment, or some other form of punishment is the appropriate response to instances of criminal wrongdoing?

Finally, as mentioned, it is important to ask about the state’s role as the agent of punishment. Why is it the state’s right to impose punishment (if indeed it is)? Furthermore, what gives the state the exclusive right to punish (Wellman, 2009)? Why may victims not inflict punishment on their assailants (or hire someone to inflict the punishment)? Another question related to the proper agent of punishment—a question that has become increasingly salient in the decades following the Nuremberg trials—is when (if ever) the international community, rather than a particular state, can be the proper agent of punishment. What sorts of crime, and which criminals, are properly accountable to the institutions of international criminal law rather than (or perhaps in addition) to the domestic legal systems of particular states?

As we will see, various accounts of punishment focus on different questions. Also, some accounts seek to answer each of these questions by appealing to the same moral principles or considerations, whereas others appeal to different considerations in answering the different questions.

3. Consequentialist Accounts

Consequentialism holds that the rightness or wrongness of actions—or rules for action, or (relevant to our context) institutions—is determined solely by their consequences. Thus consequentialist accounts of punishment defend the practice as instrumentally valuable: the consequences of maintaining an institution of legal punishment, according to this view, are better than the consequences of not having such an institution. For many consequentialists, the burden of punishment itself is seen as a negative consequence—an “evil,” as Jeremy Bentham called it (Bentham, 1789: 158). Thus for punishment to be justified, it must be the case that it brings about other, sufficiently valuable consequences to outweigh its onerousness for the person on whom it is inflicted. Typically, punishment is defended as a necessary means to the socially valuable end of crime reduction, through deterrence, incapacitation, or offender reform.

a. Deterrence

Deterrence accounts contend that the threat of punishment serves as a disincentive for potential criminals. On such accounts, for the threat of punishment to be effective as a deterrent, it must be credible—it must have teeth, so to speak—and thus the legal system must follow through on the threat and impose punishment on those who violate laws. Theorists have distinguished two potential audiences for the deterrent threat: first, the threat of punishment might serve to dissuade members of the public generally from committing crimes that they might otherwise have committed. This is called general deterrence. Second, for those who do commit crimes and are subjected to punishment, the threat of future punishment (namely, the prospect of having to experience prison again, or pay further fines, and so forth) might provide a disincentive to reoffending. This is typically referred to as specific (or special) deterrence.

b. Incapacitation

Punishment might also help to reduce crime by incapacitating criminals. Unlike deterrence, incapacitation does not operate by dissuading potential offenders. Incapacitation instead aims to remove dangerous people from situations in which they could commit crimes. Imprisoning someone in a solitary confinement unit, for instance, may or may not convince her not to commit crimes in the future; but while she is locked up, she will be unable to commit (most) crimes.

c. Offender Reform

A third way in which punishment might help to reduce crime is by encouraging or facilitating offender reform. The aim of reform is like that of specific deterrence in one respect: both seek to induce a change in the offender’s behavior. That is, the aim for both is that she should choose not to reoffend. In this respect, both reform and specific deterrence differ from incapacitation, which is concerned with restricting rather than influencing offenders’ choices. But reform differs from specific deterrence in terms of the ways in which each seeks to induce different choices. Punishment aimed at specific deterrence provides prudential reasons: we impose onerous treatment on an offender in hopes that her aversion to undergoing such treatment again will convince her not to reoffend. Punishment with the aim of offender reform, by contrast, aims to reshape offenders’ moral motives and dispositions.

d. Sentencing

Each of these aims—deterrence, incapacitation, and reform—will have distinct implications with respect to sentencing. Punishment aimed at reducing crime through deterrence would in general need to be severe enough to provide members of the public with a significant incentive not to offend, or to provide offenders with an incentive not to reoffend. Also, as Bentham explained, the severity of sentences should reflect the relative seriousness of the crimes punished (Bentham, 1789: 168). More serious crimes should receive more severe punishments than do less serious crimes, so that prospective offenders, if they are going to commit one crime or the other, will have an incentive to choose the less serious crime.

For punishment aimed at reducing crime through incapacitation, sentences should be restrictive enough that dangerous offenders will be unable to victimize others (so, for instance, prison appears generally preferable to fines as a form of incapacitative punishment). In terms of duration, incapacitative sentences should last as long as the offender poses a genuine threat. Similarly, sentences aimed at reducing crime through offender reform should be tailored, in terms of the form, severity, and duration of punishment, in whatever ways are determined to be most conducive to this aim.

Finally, insofar as punishment itself is considered to be, in Bentham’s words, an “evil,” the consequentialist is committed to the view that sentences should be no more severe than is necessary to accomplish their aim. Thus whether she endorses deterrence, incapacitation, reform, or some other aim (or a combination of these), the consequentialist should also endorse a parsimony constraint on sentence severity (Tonry, 2011). After all, to impose sentences that are more severe than is necessary to accomplish punishment’s aim(s) would appear to be an infliction of gratuitous suffering—and so, from a consequentialist perspective, unjustified.

e. Objections and Responses

Typical consequentialist accounts of punishment contend that the practice is justified because it produces, on balance, positive consequences by helping to reduce crime, either through deterrence, incapacitation, or offender reform. Critics have objected to such consequentialist accounts on a number of grounds.

First, some have objected to deterrence accounts on grounds that punishment does not actually deter potential offenders. A key worry is that often (perhaps typically) those who commit crimes act impulsively or irrationally, rather than as efficient calculators of expected utility, and so they are not responsive to the threat of punishment. The question of whether punishment deters is an empirical one, and criminological studies on this question have come to different conclusions. In general, evidence seems to indicate that punishment does have some deterrent effect, but that the certainty of apprehension plays a greater deterrent role than does the severity of punishment (Nagin, 2013).

A similar line of objection has been raised against reform-based accounts of punishment. Criminological research in the 1970s led many scholars and practitioners to conclude that punishment did not, indeed could not, promote offender reform (the mantra “nothing works” was for many years ubiquitous in these discussions). More recent criminological work, however, has generated somewhat more optimism about the prospects for offender reform (Cullen, 2013).

Whereas critics have questioned whether punishment deters or facilitates offender reform, there is little doubt that punishment—especially incarceration—incapacitates (prisoners may still have opportunities to commit crimes, but their opportunities are at least significantly limited). Critics have raised questions, however, about the link between incapacitation and crime reduction. For punishment to be justified on incapacitative grounds, after all, it would need to be the case not only that punishment in fact incapacitates, but that in so doing it helps to reduce crime. At least in some cases, there is reason to doubt whether the link between incapacitation and crime reduction holds. Most notably, locking up drug dealers or gang members does not appear to decrease drug- or gang-related crimes, because the incapacitated person is quickly and easily replaced by someone else (Tonry, 2006: 31-32).

Even if we accept, for argument’s sake, that punishment contributes to crime reduction, it still may not be justified on consequentialist grounds if it also generates costs that outweigh its benefits. The costs of punishment are not limited to the suffering or other burdens inflicted on offenders, although these burdens do matter from a consequentialist perspective. Scholars have also highlighted burdens associated with certain forms of punishment—in particular, incarceration—for offenders’ families and communities (Mauer and Chesney-Lind, 2002). These costs matter in consequentialist calculations. In addition, we must consider the financial costs of maintaining an institution of criminal punishment. In 2012, the Vera Institute of Justice released a study of 40 U.S. states that found that the total taxpayer cost of prisons in these states was $39 billion. Thus defenders of punishment on consequentialist grounds must show not only that punishment is beneficial, but also that its benefits are significant enough to outweigh its costs to offenders and to society generally.

Furthermore, even if punishment’s benefits outweigh its costs, consequentialists must make the case that these benefits cannot be achieved through some other, less burdensome response to crime. If there are alternatives to punishment that are equally effective in reducing crime but are less costly overall, then from a consequentialist perspective, these alternatives would be preferable (Boonin, 2008: 53, 264-67).

Suppose, however, that the benefits of punishment outweigh its harms and also that there are no alternatives to punishment that generate, on balance, better overall consequences. In this case, punishment would be justified from a consequentialist perspective. Many theorists, however, do not endorse consequentialism. Indeed, the most prominent philosophical objections to consequentialist accounts of punishment take aim specifically at supposed deficiencies of consequentialism itself.

Perhaps the most common objection to consequentialist accounts is that they are unable to provide principled grounds for ruling out punishment of the innocent. If there were ever a situation in which punishing an innocent person would promote the best consequences, then consequentialism appears committed to doing so. H. J. McCloskey imagines a case in which, in the wake of a heinous crime, a small-town sheriff must decide whether to frame and punish a person whom the townspeople believe to be guilty but the sheriff knows is innocent if doing so is the only way to prevent rioting by the townspeople (McCloskey, 1957: 468-69). If punishing the innocent person defuses the residents’ hostilities and prevents the riots—and thereby produces better overall consequences than continuing to search for the actual criminal—then it appears that the consequentialist is committed to punishing the innocent person. But knowingly punishing an innocent person strikes most of us as deeply unjust.

Consequentialists have responded to this objection in various ways. Some contend that what McCloskey describes is not actually punishment, because punishment, by definition, is a response to those guilty of crimes (or at least believed to be guilty, whereas in McCloskey’s example, the sheriff knows the person to be innocent). H. L. A. Hart refers to this response as the “definitional stop” and he suggests it is unhelpful because it seeks to define away the interesting normative questions. Setting terminology aside, the relevant questions are whether and why it is permissible to impose intended, condemnatory burdens on those (believed to be) guilty of crimes. The consequentialist’s response is that doing so produces the best consequences, but then it seems that the consequentialist should be committed to imposing such burdens on those not (believed to be) guilty of crimes when doing so produces the best consequences. Such a practice would strike many as morally wrong, however. Thus the objection arises for consequentialists regardless of definitions.

Others have responded to the objection that consequentialism would allow for punishing the innocent by suggesting that scenarios such as the one McCloskey describes are so far-fetched that they are unlikely to occur in the real world. In actual cases, punishing the innocent will rarely, if ever, produce the best consequences. For instance, some contend that the sheriff in the example would likely be found out, and as a result the public would lose its trust in law enforcement officials; the long-term consequences, therefore, would be worse than if the sheriff had not punished the innocent person. As critics have pointed out, however, this response only shows that punishing the innocent will usually be ruled out by consequentialism. There might still be cases, albeit rare, in which punishing the innocent would generate the best consequences (maybe the sheriff is adept at covering up his act). At best, then, consequentialism seems only able to ground a contingent prohibition on punishing the innocent. Some consequentialists have accepted this implication, albeit reluctantly (see Smart, 1973: 69-73).

A similar objection to consequentialist accounts is that they cannot provide a principled basis for the widely held intuition that punishment should be no more severe than an offender deserves (where desert is the product of the seriousness of the offense and the offender’s culpability). On this view, it is morally wrong to subject those guilty of relatively minor crimes to harsh punishment; such punishment would be excessive. For consequentialist accounts, though, it appears that excessively harsh sentences would be permitted (indeed, required) if they produced the best overall consequences.

Jeremy Bentham contended that consequentialism does have the resources to ground relative proportionality in sentencing—that is, lesser offenses should receive less severe sentences than more serious offenses receive. His reasoning was that if sentences for minor offenses were as harsh as for more serious offenses, potential offenders would have no incentive to commit the lesser offense rather than the more serious one (Bentham, 1789: 168). If Bentham is right, then there is a consequentialist basis for punishing shoplifters, for instance, less harshly than armed robbers. But this does not rule out punishing shoplifters harshly (more harshly than most of us would think justified) and punishing armed robbers even more harshly; again, a consequentialist would seem committed to such a sentencing scheme if it promoted the best overall consequences.

Defenders of consequentialist sentencing have another response available, namely that excessively harsh sentences do not, in practice, produce the best consequences. For instance, criminological research suggests a) that stiffer sentences do not produce significant deterrent effects (it is primarily the certainty of punishment rather than its severity that deters); b) that extremely long prison terms are not justified on incapacitative grounds (for one reason, most offenders “age out” of criminal behavior anyway by their 30s or 40s); and c) that extremely harsh sentences may, on balance, have criminogenic effects (that is, they may make people more likely to reoffend). This sort of response, of course, makes the prohibition of disproportionate punishment a contingent matter; in other words, if extremely harsh sentences did help to reduce crime and this produced, on balance, the best overall consequences, then consequentialism would appear to endorse such sentences. Critics thus charge that consequentialist accounts are unappealing insofar as they are unable to ground more than a contingent prohibition on disproportionately harsh punishment.

Even if we prohibit punishment of the innocent or disproportionate punishment of the guilty, a third, Kantian objection holds that consequentialist punishment is not properly responsive to the person being punished. According to this objection, to punish offenders as a means to securing some valuable social end (namely, crime reduction) is to use them as mere means, rather than respecting them as ends in themselves (Kant, 1797: 473; Murphy, 1973).

In response to this objection, some scholars have contended that although consequentialists regard punishment as a means to an end, punishment does not treat offenders as mere means to this end. If we limit punishment to those who have been found guilty of crimes, then this treatment is arguably responsive to their choices and does not use them as mere means. Kant himself suggested that as long as we reserve punishment only for those found guilty of crimes, then it is permissible to punish with an eye toward potential benefits (Kant, 1797: 473).

A more recent objection to consequentialist systems of punishment, developed by R. A. Duff (1986, 2001), charges that consequentialist systems of punishment, with their focus on crime reduction, treat offenders as dangerous “outsiders”—as the “they” whom “we,” the law-abiding members of society, must threaten, incapacitate, or remold to ensure our safety. Such a conception of the criminal law is inappropriately exclusionary, Duff claims. The criminal law, and the institution of punishment, in a liberal polity should treat offenders inclusively, as (still) members of the community who despite having violated its values could, and should, nevertheless (re)commit to these values.

In response, one might object that systems of punishment aimed at crime reduction need not be exclusionary in the way Duff suggests. In particular, punishment that aims to deter crime might be said to treat all community members equally, namely as potential offenders. For those who have not committed crimes, deterrent punishment regards them as potential offenders and aims to provide an incentive not to offend (that is, general deterrence). For those who have committed crimes, deterrent punishment similarly regards them as potential (re)offenders and aims to provide an incentive not to (re)offend (that is, specific deterrence). In this way, punishment with a deterrent aim might be said to speak to all community members in the same terms, and thus not to be objectionably exclusionary.

4. Retributivist Accounts

As we have seen, consequentialist accounts of punishment are essentially forward-looking—punishment is said to be justified in virtue of the consequences it helps to produce. A different sort of account regards punishment as justified not because of what it brings about, but instead because it is an intrinsically appropriate response to crime. Accounts of the second sort have traditionally been described as retributivist. In general, we can say that retributivism views punishment as justified because it is deserved, although particular accounts differ about what exactly this means.

Theorists have distinguished two varieties of retributivism: positive retributivism and negative retributivism. Positive retributivism is typically characterized as the view that an offender’s desert provides a positive justifying reason for punishment; in other words, the state should punish those who are found guilty of criminal wrongdoing because they deserve it. Negative retributivism, by contrast, provides a constraint on punishment: punishment is justified only of those who deserve it. Because negative retributivism provides only a constraint on punishment, not a positive reason to punish, the negative retributive constraint has featured prominently in attempts at mixed accounts of punishment; such accounts allow punishment for consequentialist aims as long as the punishment is only of those who deserve it. On the other hand, because negative retributivism does not provide a positive justifying reason to punish, some scholars argue that it does not properly count as retributivism at all.

The distinction between retributivism and consequentialism is not always a neat one. Notice that one might endorse the claim that punishment is a deserved response to wrongdoing and then further assert that it is a valuable state of affairs when wrongdoers get the punishment they deserve—a state of affairs that therefore should be promoted. On this type of account, retribution itself essentially becomes the consequentialist aim of punishment (Moore, 1903; Zaibert, 2006). Nevertheless, in keeping with general practice, this article will treat retributivism as distinct from, and in competition with, consequentialist accounts.

a. Deserved Suffering

One common version of retributivism contends simply that wrongdoers deserve to suffer in proportion to their wrongdoing. Often this claim is made by way of appeal to intuitions about particular, usually heinous crimes: surely the unrepentant war criminal, for example, who has tortured and murdered many innocent people, deserves to suffer for what he has done. Proponents argue that retributivism is justified because it best accounts for our intuitions about particular cases such as these (Moore, 1987; Kleinig, 1973).

Justifying retributivism requires more, of course, than merely appealing to common intuitions about such cases. After all, even if many (even most) people do feel, in hearing reports of terrible crimes, that the perpetrators deserve to suffer, not everyone feels this way. And even those who do have such intuitions may not feel entirely comfortable with them. What we would like to know is whether the intuitions themselves are justified, or whether, for instance, they amount to an unhealthy desire for vengeance. Critics contend that those who rely on our intuitions about particular cases as evidence that retributivism is justified fail to provide the needed explanation of why the intuitions are justified.

There are other questions for such a view: does any sort of moral wrongdoing deserve to be met with suffering, or only some cases of wrongdoing? Which ones? And why is meting out deserved suffering for wrongdoing properly the concern of the state?

b. Fair Play

Another prominent type of retributivist account begins with a conception of society as a cooperative venture in which each member benefits when there is general compliance with the rules governing the venture. Because each of us benefits when everyone else plays by the rules, fairness dictates that we each have an obligation to reciprocate by playing by the rules, too. A criminal, like other members of society, benefits from general compliance with laws, but she fails to reciprocate by complying with the laws herself. She essentially becomes a free rider, because she counts on others to play by the rules that she violates. By failing to restrain herself appropriately, she gains an unfair advantage over others in society. The justification of punishment is that it corrects this unfair advantage by inflicting burdens on the offender proportionate to the benefit she gained by committing her crime (Morris, 1968).

On the fair play view, then, punishment is justified as a deserved response to an unfair advantage taken against members of society generally. Such an account offers a relatively straightforward answer to the question of why punishment is the state’s business. The state has an interest in assuring those who accept the burdens of compliance with the law that they will not be at a disadvantage to those who would free-ride on the system.

Critics of the fair play view have argued that it provides a counterintuitive conception of the crime to which punishment responds. It seems strange, for instance, to think of the wrong perpetrated by, say, a rapist as a sort of free-riding wrong against society in general, rather than an egregious wrong perpetrated against the victim in particular. In response to this charge, Dagger (1993) argues that crimes may be wrong in both senses: they may wrong particular victims in various ways, but they are also in every case wrongs in the sense of free riding on society generally.

c. Censure

Another influential version of retributivism begins with the claim, discussed earlier, that one of punishment’s distinctive features is that it communicates censure, or condemnation, of the offender for her offense. This retributivist account, developed most notably by R. A. Duff (1986, 2001), takes the censuring feature as the key to establishing punishment’s moral permissibility. Offenders deserve to be censured for what they have done, and punishment is justified because it delivers this censuring message.

Duff understands crimes as public wrongs, as violations of important public values. It follows on this account that the state is the appropriate agent of punishment; the state properly calls offenders to account for their violations of the political community’s shared values.

Censuring involves, in part, urging an offender to think about the wrong she has done, to repent and (re)commit herself to the values that she has violated. Thus it follows from censure accounts such as Duff’s that offender self-reform is an aim of punishment. But notice the crucial distinction between this sort of account and the variety of consequentialist account that aims at offender reform. Although offender reform is an aim of punishment on the censure account, it is not a justifying aim. In other words, on the censure view, punishment is not justified insofar as it tends to promote offender reform. Rather, punishment is justified because it communicates deserved censure. Part of what it means to censure, however, is to urge wrongdoers to repent and reform.

A common critique of the censure view asks why punishment—that is, the imposition of intended burdens—is the proper way to censure wrongdoers. It seems that the polity could communicate messages of censure to offenders without imposing intended burdens; for example, it could issue a public proclamation condemning the crime and blaming the offender. Why, then, is the hard treatment characteristic of punishment an appropriate vehicle for conveying such messages? One type of response, offered by Duff and others (see also Falls, 1987), holds that hard treatment is needed to convey adequately the polity’s condemnation of crimes. Nonpunitive censure—blaming without imposing intended hard treatment—would fail to communicate the seriousness of the wrongdoing.

Also, on Duff’s account, hard treatment can function to induce in offenders the sort of moral reflection that may lead to repentance, reform, and reconciliation (with their victims and the community more generally). Some have objected, however, that such an account implies too intrusive a role for the state. It is not a proper function of the state, critics charge, to seek to induce repentance and moral reform in offenders. Thus even some scholars who agree that punishment is justified as a form of censure nevertheless disagree about the role of the hard treatment element. For Andrew von Hirsch (1993), for instance, the intended burdens characteristic of punishment act as a sort of prudential supplement: punishment, as censure, serves to remind offenders (and community members) of the moral reasons to comply with the law. Punishment, as hard treatment, also provides a prudential threat as a sort of supplement for those of us for whom the moral message is not sufficient. One worry with such an account, however, is whether the prudential threat will tend to drown out the moral message.

d. Other Versions

Alternative versions of retributivism have been offered. Some scholars, for instance, argue that those who commit crimes violate the trust of their fellow community members. Trust, on this account, is an essential feature of a healthy community. Offenders undermine this trust when they victimize others. In such cases, punishment is a deserved response to such violations and an appropriate way to help maintain (or restore) the conditions of trust among community members (see Dimock, 1997). Advocates of this trust-based variety of retributivism must explain which violations of trust rise to the level that warrants criminalization, so that violators should be subject to punishment. Also, we might question whether such accounts are purely retributivist after all: if punishment is justified at least in part as a means of helping to maintain conditions of trust in a community, then this appears to be a consequentialist rationale. On the other hand, if punishment is justified not for what it helps to bring about but rather as an intrinsically appropriate (because deserved) response to violations of trust, then we need an explanation of why such violations deserve punishment, perhaps as opposed to some other form of response.

Another form of retributivism holds that offenders incur a moral debt to their victims, and so they deserve punishment as a way to repay this debt (McDermott, 2001). This moral debt is distinct from the material debt that an offender may incur. In other words, a person who steals from another person incurs a material debt equal to the value of whatever was stolen, but she also incurs a moral debt for violating the victim’s rights. The offender takes not only a material good from the victim but also a moral good. Repayment of material goods does not settle this moral debt, and so punishment is needed to fill this role. As Daniel McDermott characterizes it, punishment serves to deny the ill-gotten moral good to the perpetrator (McDermott, 2001: 424).

Such an account raises a host of questions: what precisely is the nature of the moral good that has been taken from the victim? How can a moral good be taken away from someone? In what sense (if at all) has the perpetrator gained this good? How does punishment deny this good to the offender, and how does this thereby make things right for the victim?

e. Sentencing

Because retributivism claims that punishment is justified as a deserved response to wrongdoing, retributivist accounts should provide some guidance about what sentences are deserved in particular cases. Typically, retributivists hold that sentences should be no more severe than is deserved. This negative retributivist constraint on sentencing corresponds with the negative retributivist constraint on punishment itself (namely, that punishment is justified only of those who deserve it). By contrast, positive retributivism holds that offenders’ sentences should be no less severe than they deserve. Some scholars find this positive retributivism unappealing because it seems to preclude the state from taking into account mercy or other considerations that might count in favor of lenient sentences. In other words, some are more comfortable with retributivism’s setting a ceiling but not a floor on sentence severity. One question, though, is whether (and if so, why) retributivists are justified in endorsing the negative retributivist constraint on sentencing without also endorsing the positive retributivist constraint.

Retributivists often discuss sentencing in terms of proportionality, where a proportionate sentence is understood as one that is deserved (or at least, on some accounts, not clearly undeserved). Sentences may be proportionate in two senses: first, they may be proportionate (or disproportionate) relative to each other. This sense of proportionality, called ordinal proportionality, holds that similarly serious offenses should receive similarly severe punishments (like cases should be treated alike); that more serious offenses should be punished more harshly than less serious offenses (murder should be punished more harshly than shoplifting, for instance); and that differences in sentence severity should reflect differences in relative seriousness of offenses (because murder is much more serious than shoplifting, murder should carry a much more severe sentence).

Some scholars have challenged the notion of ordinal proportionality constraints in sentencing, both because offenders cannot neatly be distinguished into a manageable number of desert-based groups—Michael Tonry calls this the “illusion of ‘like-situated offenders’” (Tonry, 2011)—and because individual offenders’ subjective experiences of the same sentence may vary greatly. For example, someone who is young, physically imposing, or has no children may have a much different experience of a 10-year prison term from someone who is much older, physically frail, or must leave behind her children to serve the sentence. Considerations such as these do not in themselves demonstrate that the tenets of ordinal proportionality are false (that like cases should not be treated alike, for instance, or that more serious violations should not receive harsher sentences). Rather, these considerations raise challenges to our ability in practice to implement a just sentencing scheme that reflects ordinal proportionality.

Even if sentences can be devised that satisfy ordinal proportionality, however—in other words, even if a sentencing scheme itself is internally proportionate—particular sentences may fail to be proportionate if the entire sentencing scheme is too severe (or lenient). For instance, a sentencing scheme in which even the least offenses were punished with prison terms would appear disproportionate even if sentences in the scheme were proportionate relative to each other. Thus theorists note a second sense of proportionality: cardinal, or nonrelative, proportionality. Cardinal proportionality considers whether sentences are commensurate with the crimes they punish. A prison term for jaywalking would appear to violate cardinal proportionality, because such a sentence strikes us as too severe given the offense, even if this sentence were proportionate with other sentences in a sentencing scheme—that is, even if it satisfied ordinal proportionality. Thus cardinal proportionality concerns not the relation of sentences to one another, but instead the relation of a sentence to the crime to which it is a response. Put another way, even if an entire sentencing scheme is internally (ordinally) proportionate, we need guidance in how to anchor the sentencing scheme to the crimes themselves so that offenders in particular cases receive the sentences they deserve.

In addition to addressing questions of deserved sentence severity, we would like retributivism to provide some guidance about how to determine what mode, or form, of punishment is appropriate in response to a given crime. Is prison time, community service, capital punishment, probation, or something else the deserved form of response, and why?

The implications of retributivism for sentencing will depend on the specific account’s explanation of why punishment is said to be the deserved response to offending.

Those who appeal to intuitions that the guilty deserve to suffer, for instance, can similarly appeal to intuitions that those who are guilty of more serious offenses deserve to suffer more than those who are guilty of less serious offenses. As discussed, however, we would like to know how much punishment is deserved in particular cases in nonrelative terms, and also what form the suffering should take. One well-known account of sentencing is provided by lex talionis (that is, an eye for an eye, a tooth for a tooth). Immanuel Kant famously endorsed this principle: “Accordingly, whatever undeserved evil you inflict upon another within the people, that you inflict upon yourself” (Kant, 1797: 473). As critics have noted, though, not every crime appears to have an obvious like-for-like response—what would lex talionis demand for the childless kidnapper, for instance (Shafer-Landau, 2000: 193)? And even when a like-for-like response is clearly indicated, it will not always be palatable (torturing the torturer, for example).

We might assert instead that the sentence and the offense need not be alike in kind, but that the sentence should impose an amount of suffering equal to the harm done by the offender. Still, questions arise of how to make interpersonal comparisons of suffering. And again, for the most heinous crimes, a principle of inflicting equal amounts of suffering may recommend sentences that we would find troubling.

The fair play view holds that punishment functions to remove an unfair advantage gained by an offender relative to members of society generally. Critics of this view often object, however, that it provides insufficient or counterintuitive guidance about sentencing. Put simply, there does not seem to be any advantage that an offender gains, in proportion with the seriousness of her crime, relative to community members generally. On one version of the view, the offender gains freedom from the burden of self-constraint that others accept in complying with the particular law that the offender violates. If so, then the sentence severity should be proportionate to the burden others feel in complying with that law. But compliance with laws is often not a burden for most citizens. Indeed, it is often less burdensome to comply with prohibitions on serious offenses (murder, assault, and so forth) than it is to comply with prohibitions on lesser crimes (tax evasion, jaywalking, and so forth), given that we are more often tempted to commit the lesser crimes. But if the unfair advantage that punishment aims to remove is freedom from the burden of self-constraint, and if self-constraint is often more burdensome with lesser crimes, then these less serious crimes will often appear to merit relatively more severe punishments. This is a violation of ordinal proportionality.

Similar problems arise for other versions of the fair play view. Suppose, for instance, that the unfair advantage a criminal gains is not freedom from the burden of complying with the particular law she violates, but rather freedom from complying with the rule of law in general. This general compliance, Richard Dagger writes, is a genuine burden: “there are times for almost all of us when we would like to have the best of both worlds—that is, the freedom we enjoy under the rule of law plus freedom from the burden of obeying laws” (Dagger, 1993: 483). Critics have objected, however, that on this conception of the unfair advantage all offenses become, for the purposes of punishment, the same offense. Both the murderer’s and the tax cheat’s unfair advantage is freedom from compliance with the rule of law generally. If the unfair advantage is the same, however, then removing the advantage would seem to require equal sentences. Again, such sentencing appears to violate ordinal proportionality.

For the censure view, questions arise about what form of punishment and what severity will communicate the deserved message of condemnation in particular cases. On such a view, the principles of ordinal proportionality appear to follow straightforwardly: censure should reflect the seriousness of the wrongdoing, and so if punishment is the vehicle of communicating censure, then sentences should reflect the appropriate relative degree of censure for each case.

The censure view should provide guidance not only about how severely to punish crimes relative to each other, but also how severely to punish in absolute terms, and also the appropriate mode of punishment. To say that manslaughter should be censured more severely than theft, for instance, does not actually tell us how severely to censure manslaughter or theft, or with what form of punishment. Again, the challenge is in determining how to anchor the sentencing scale to actual offenses. Should the least serious offenses receive censure in the form of a small fine, a day in jail, or a year in jail? Should the most serious offenses receive capital punishment, life imprisonment, or some less severe sentence?

Similar questions arise for accounts that characterize punishment as a deserved response to violations of trust, or as a deserved response to the incurrence of a moral debt. What form and severity of punishment is appropriate to maintain conditions of community trust in response to attempted kidnapping, or the theft of a valuable piece of art? How severe must a sentence be to resolve the moral debt that is incurred when someone impersonates a police officer or cheats on her taxes?

Indeed, questions about fixing deserved sentences in response to particular offenses arise for retributivist accounts generally. Critics have charged that retributivism is unable to provide adequate, nonarbitrary guidance about either the deserved severity or deserved form of punishment in particular cases (see Shafer-Landau, 2000).

Retributivists are, of course, aware of such objections and have sought to meet them in various ways. Nonetheless, questions about proportionate sentencing continue to be a central challenge for retributivist accounts.

5. Alternative Accounts

In part as a response to objections commonly raised against consequentialist or retributivist views, a number of theorists have sought to develop alternative accounts of punishment.

a. Rights Forfeiture

At the outset, we said that the central question of punishment’s permissibility is why (if at all) it is permissible to treat those who have committed criminal offenses in ways that typically would be impermissible. For some theorists, this question is best cast in terms of rights: why are the sorts of intended burdens characteristic of punishment, which would constitute rights violations if imposed on those who have not been convicted of criminal wrongdoing, not violations of the rights of those punished?

One way in which punishment would not violate the rights of offenders is if, in committing the crime for which they are convicted, they forfeit the relevant right(s). Because offenders forfeit their right not to be punished, the state has no corresponding duty not to punish them. As W. D. Ross writes, “the offender, by violating the life or liberty or property of another, has lost his own right to have his life, liberty, or property respected, so that the state has no prima facie duty to spare him, as it has a prima facie duty to spare the innocent” (Ross, 1930: 60-61).

Notice that the forfeiture view itself does not imply any particular positive justification of punishment; it merely purports to explain why punishing offenders does not violate their rights. This is consistent with maintaining that the positive justification of punishment is that it helps reduce crime, or conversely, that wrongdoers deserve to be punished. Thus the forfeiture view does not provide a complete account of the justification of punishment. Proponents, however, take this feature to be a virtue rather than a weakness of the view.

The forfeiture claim raises a number of key questions: first, why does someone who violates the law thereby forfeit the right not to be punished? For those who are gripped by the dilemma of why punishing offenders does not violate their rights, the mere answer that offenders forfeit their rights, without some deeper account of what this forfeiture amounts to, may seem inadequate. Thus some theorists attempt to ground their forfeiture claim in a more comprehensive moral or political theory (see, for instance, Morris, 1991).

Second, what is the nature of the rights forfeited? Do offenders forfeit the same rights they violate? If so, then this raises some of the same challenges as we saw with certain forms of retributivism: what right is forfeited by a childless kidnapper, for example? Alternatively, is the forfeited right simply the right not to be punished? If every offender forfeits this same, general right, then on what basis can we distinguish what sentence is permissible for different offenders? For example, if the burglar forfeits the same right as the murderer, then what prevents us from imposing the same punishment in each case? (Could two offenders forfeit the same right to different degrees, as some have suggested?)

Third, how should we determine the duration of the forfeiture? Fourth, if an offender forfeits her right against punishment, then why does the state maintain an exclusive right to punish? Why are other individuals not permitted to punish?

b. Consent

Rights forfeiture theorists argue that punishment does not violate offenders’ rights because offenders forfeit the relevant rights. Another way that punishment might be said not to violate offenders’ rights is if offenders waive their rights. This is the central claim of the consent view. Defended most notably by C. S. Nino (1983), the consent view holds that when a person voluntarily commits a crime while knowing the consequences of doing so, she effectively consents to these consequences. In doing so, she waives her right not to be subject to punishment. This is not to say that she explicitly consents to being punished, but rather that by her voluntary action she tacitly consents to be subject to what she knows are the consequences.

Like the forfeiture view, the consent view does not supply a positive justification for punishment. To say that a person consents to some treatment does not by itself provide us with a reason to treat her that way. So the consent view, like the forfeiture view, is compatible with consequentialist aims or with the claim that punishment is a deserved response to offending.

One challenge for the consent view is that it does not seem to justify punishment of offenders who do not know that their acts are subject to punishment. For someone to have consented to be subject to certain consequences of an act, she must know of these consequences. What’s more, even if an offender knows she is committing a punishable act, she might not know the extent of the punishment to which she is subject. If so, then it is not clear how she can be said to consent to her punishment. It is not clear, for example, that a robber who knows that robbery is a punishable offense but does not realize the severity of the punishment to which she will be subject thereby consents to her sentence.

By contrast, other critics have charged that the consent view cannot rule out sentences that most of us would find excessive. This is because a person who voluntarily commits an action with knowledge of the legal consequences, whatever these consequences happen to be, has consented to be subject to the consequences. As Larry Alexander has put it: “If the law imposes capital punishment for overparking, then one who voluntarily overparks ‘consents’ to be executed” (Alexander, 1986).

Another difficulty for the consent view is that tacit consent typically can be overridden by explicit denials of consent. Thus it would seem to follow that one who tacitly consents to be subject to punishment could override this tacit consent by explicitly denying that she consents. But of course, we do not think that an offender should be able to avoid punishment by explicitly refusing to consent to it (Boonin, 2008).

c. Self-Defense

Another proposed justification of punishment conceives of punishment as a form of societal self-defense. First consider self-defense in the interpersonal context: When an assailant attacks me, he culpably creates a situation in which harm will occur: either harm to me if I do not effectively defend myself or harm to him if I do. In such a circumstance, I am justified in acting so that the harm falls on my attacker rather than on me. Similarly, when an offender creates a situation in which either she or her victim will be harmed, the state is permitted to use force to ensure that the harm falls on the perpetrator rather than on the victim (Montague, 1995).

So far, this view appears to justify state intervention only to stop ongoing crimes or ward off impending crimes. How does this view justify punishment as a response to past crimes? Advocates of the view claim that the state is not only justified in intervening to stop actual offenses; it is also permitted to threaten the use of force to deter such crimes. For the threat to be credible and thus effective as a deterrent, however, the state will need to follow through on the threat in cases in which offenders are not deterred. Thus punishment of offenders is permissible.

Notice that although the self-defense account views punishment as a deterrent threat, it is not a pure consequentialist account. Crucial to punishment’s permissibility on the self-defense view is the claim that an offender has culpably created the circumstance in which harm will fall either on the perpetrator or the victim. This backward-looking element is missing from pure consequentialist accounts that cite punishment’s deterrent effects in defending the practice.

Critics object that the analogy between self-defense and punishment breaks down in a number of respects. First, many self-defense theorists argue that the logic of defensive force permits the use of such force even against “innocent” threats. But we do not typically believe that, by analogy, punishment of innocent people is permitted, even if such punishment helped to maintain the credibility of a deterrent threat. Second, the degree of force that is permitted to stop an actual attack may far exceed what we intuitively believe would be permitted as punishment of an offense that has already been committed.

Third, it is one thing to follow through on a threat in order to deter the person who has just offended from offending again. It is another thing—and one might argue, more difficult to justify—to punish one person in order to maintain a credible deterrent threat against the public generally. If we believe the primary deterrent effect of punishment is as a general deterrent (rather than as a specific deterrent), then the analogy with typical accounts of self-defense seems strained. It would be as if, to deter the oncoming assailant from following through with his attack, I grab someone nearby (who has previously attacked me) and inflict the same degree of harm that I would aim to inflict on the assailant to defend myself. This might, of course, be permissible if my previous attacker had thereby acquired a duty to protect me from future harm by allowing himself to be punished as a means of maintaining a credible deterrent threat (Tadros, 2011).

d. Moral Education

The moral education view shares certain features of consequentialist accounts as well as retributivist accounts. On this view, punishment is justified as a means of teaching a moral lesson to those who commit crimes (and perhaps to community members more generally, as well).

Like standard consequentialist accounts, the education view acknowledges that part of the story of punishment’s justification involves its importance in reducing crime. But the education theorist also takes seriously the worry expressed by many retributivists that aiming to shape people’s behavior merely by issuing threats is, in G. W. F. Hegel’s words, “much the same as when one raises a cane against a dog; a man is not treated in accordance with his dignity and honour, but as a dog” (Hegel, 1821: 36). By contrast, a central feature of the moral education view is that those who commit crimes are moral agents, capable of reflecting on and responding to moral reasons. Thus moral education theorists view punishment not as a means of conditioning people to behave in certain ways, but rather of “teaching the wrongdoer that the action she did (or wants to do) is forbidden because it is morally wrong and should not be done for that reason” (Hampton, 1984).

Another way to express this difference between the education view and standard consequentialist views is that consequentialist views focus entirely on whether punishment promotes some goal. The education view, however, holds that only certain means are appropriate for pursuing this goal: namely, punishment aims to engage with the offender as a moral agent, to teach her that (and why) her behavior was morally wrong, so that she will reform herself. Thus we can even distinguish the education view from consequentialist accounts that aim at crime reduction through offender reform. For such consequentialist accounts, punishment’s justification is solely a matter of whether, on balance, it promotes these ends. The education view sets offender reform as an end, but it also grounds certain constraints on how we may appropriately pursue this end.

The education view, like the retributive censure view discussed earlier, views punishment as a communicative enterprise. Punishment communicates to offenders (indeed, to the community more generally) that what they have done is wrong. Thus on both accounts, punishment aims to encourage offenders to reform themselves. But whereas the retributive censure theorists view the message conveyed by punishment as justified insofar as it is deserved, education theorists contend that punishment is justified in virtue of what it aims to accomplish. In this respect, the education view sits more comfortably with standard consequentialist accounts than with retributivist views.

The education view conceives of punishment as aiming to confer a benefit on the offender, the benefit of moral education. This is not to say that punishment is not burdensome; as we have seen, its burdensomeness is an essential feature of punishment. But the burdens of punishment are intended to be ultimately beneficial. Thus education theorists roundly reject accounts according to which it is permissible (or even required) to inflict harm on those guilty of wrongdoing. Instead, education theorists hold, following Plato, that we should never do harm to anyone, even those who have wronged us.

Critics have raised various objections to the moral education view. Some are skeptical about whether punishment is the most effective means of moral education. Others point out that many (perhaps most) offenders are not apparently in need of moral education: many offenders realize they are doing something wrong but do so anyway. Even those who do not realize this as they are acting may recognize it soon afterward. Thus they do not seem to need moral education. Finally, some object that the education view is inappropriately paternalistic. According to the education view, after all, the state is justified in coercively restricting offenders’ liberties as a means to conferring a benefit (moral education) on them. Many liberal theorists are uncomfortable, however, with the idea that the state may coerce a person for her own benefit.

e. Hybrid Approaches

Finally, some theorists have responded to seemingly intractable disputes between consequentialists and retributivists by contending that the question of punishment’s permissibility is not actually a single question at all. Instead, establishing punishment’s permissibility involves answering a number of questions: questions about the aim of the practice, about its limits, and so on. Once we distinguish different questions that bear on punishment’s permissibility, we can then recognize that these questions may be answered by appeal to different moral considerations. What emerges is a hybrid account of punishment’s permissibility.

The most famous articulation of a hybrid view comes from H. L. A. Hart (1968), although there have been numerous attempts to develop such accounts both before and after Hart. The specifics of these accounts vary somewhat, but in general the point has been to distinguish the question of punishment’s aim (Hart called this the “general justifying aim”) from the question of how we must constrain our pursuit of that aim. The first question, about punishment’s aim, is usually answered according to consequentialist considerations, whereas the second question, about appropriate constraints, is typically answered by appeal to retributivist principles. In other words, if we are asking what reason could justify society in maintaining a system of punishment, the answer will appeal to punishment’s role in reducing crime, and thereby protecting the safety and security of community members. But if we ask how we may punish in particular cases, the answer will appeal to retributivist principles about proportionality and desert. Some have distinguished these questions in terms of the proper (consequentialist) rationale of legislators in criminalizing certain types of behaviors and the proper (retributivist) rationale of judges in imposing sentences on those who violate the criminal laws.

Although such views are sometimes described as “two-question” or “two-level” views, with the focus on consequentialist aims and retributivist constraints, there is no reason in principle why we should distinguish only two questions. As we saw earlier, punishment actually raises a host of specific normative questions, and so if we accept the general strategy of distinguishing questions and answering them by appeal to different considerations, then there is no reason in principle to stop with only a two-level hybrid theory. A hybrid view might offer distinct considerations in answer to a variety of questions: what is the positive aim of punishment? Does punishment violate offenders’ rights? How severely may we punish in particular cases? What mode of punishment is permissible in particular cases? And so on.

Also, although hybrid theories typically follow the pattern of aims and constraints, so that consequentialism provides the reason to have an institution of punishment and retributivism provides constraints on how we punish, there is no reason in principle why this could not be reversed. A hybrid theory might hold that suffering is an intrinsically appropriate (deserved) response to wrongdoing, but then endorse as a constraint, for example, that such retributive punishment should never tend to undermine offender reform.

Critics have charged hybrid accounts with being ad hoc and unstable. Although we can distinguish different questions related to punishment’s permissibility, it is a mistake to think that the answers to these questions are entirely independent of each other, so that we can answer each by appeal to entirely distinct considerations. For example, if we accept the consequentialist view that punishment’s general justifying aim is that it helps to deter crime, then why would considerations of deterrence not also play a role (even a decisive role) in how severely we punish in particular cases? Why should retributivist proportionality considerations govern in sentencing if these conflict with the pursuit of crime reduction through deterrence?

Retributivists, for their part, often argue that hybrid theories such as Hart’s, on which consequentialism supplies the justifying aim of punishment, relegate retributivism to a peripheral role. Retributivists, after all, tend to regard consequentialism as providing inappropriate reasons to punish. Characterizing retributivism’s role as providing constraints on the pursuit of consequentialist aims is thus unsatisfying to many retributivists.

6. Abolitionism

Some scholars are unpersuaded by any of the standardly articulated justifications of punishment. In fact, they conclude that punishment is morally unjustified, and thus that the practice should be abolished. An obvious question for abolitionists, of course, is what (if anything) should take the place of punishment. That is, how should society respond to those who behave in ways (committing tax fraud, burglary, assault, and so on) that currently are subject to punishment?

One option would be to endorse a model of treatment rather than punishment. On this model, an offender is viewed as manifesting some form of disease or pathology, and the appropriate response is thus to try to treat and cure the person rather than to punish her. Treatment differs from punishment, first, because it need not be burdensome. At least in principle, treatment could be pleasant. In practice, of course, treatment may often be burdensome—indeed, it may involve many of the same sorts of restrictions and burdens as we find with punishment. But even though courses of treatment may be burdensome, treatment does not typically convey the condemnation that is characteristic of punishment. After all, we generally think of those who are sick as warranting sympathy or concern, not condemnation.

Other options for abolitionists would be to endorse some model of restitutive or restorative, rather than criminal, justice. We might require that offenders make restitution to their victims, as defendants in civil lawsuits are often required to make restitution to plaintiffs (Boonin, 2008: 213-75). Or offenders might engage with victims in a process of restorative justice, one in which both offenders and victims play an active role, with aims of repairing the harms done and restoring the relationships that have been damaged (Braithwaite, 1999). Neither the restitutive nor the restorative model is centrally concerned with imposing intended, censuring burdens on offenders.

Not surprisingly, these alternative accounts are themselves subject to various objections. Critics of the treatment model, for instance, charge that it provides insufficient limits on what sort of treatment of offenders is permissible. The aim of “curing” diseased individuals might warrant quite severe treatment, both in scope and duration. Similarly, scholars have argued that the treatment model fails properly to respect offenders, as it treats them merely as patients rather than as moral agents who are responsible, and should be held responsible, for their actions (Morris, 1968).

Critics of the restitutive and restorative models may point out that some crimes do not clearly lend themselves to restitution or restoration: some crimes may seem so heinous that no victim restitution or restoration of relationships is possible. Other crimes do not have clearly specifiable victims. In addition, consequentialists may worry that practices of restitution or restoration may be inadequate as means of crime reduction if, for example, they are less effective than punishment at deterring potential offenders. Retributivists also may argue that something important is lost when we respond to wrongdoing solely with restitutive or restorative practices. Particularly for those who hold that an important function of punishment is to convey societal censure, restitution or restoration may seem inadequate as responses to crime insofar as they are not essentially concerned with censuring offenders. Alternatively, some retributivists argue that the restorative ideals can best be served by a system of retributive punishment (Duff, 2001; Bennett, 2008).

7. References and Further Reading

Alexander, Larry (1986). “Consent, Punishment, and Proportionality.” Philosophy & Public Affairs 15:2, 178-82.
Bennett, Christopher (2008). The Apology Ritual: A Philosophical Theory of Punishment. Cambridge, Cambridge University Press.
Bentham, Jeremy (1789). An Introduction to the Principles of Morals and Legislation. Reprinted in J. H. Burns and H. L. A. Hart (eds.), The Collected Works of Jeremy Bentham: An Introduction to the Principles of Morals and Legislation. Oxford, Clarendon Press, 1996.
Boonin, David (2008). The Problem of Punishment. New York, Cambridge University Press.
Braithwaite, John (1999). “Restorative Justice: Assessing Optimistic and Pessimistic Accounts.” Crime and Justice 25, 1-127.
Cullen, Francis T. (2013). “Rehabilitation: Beyond Nothing Works.” Crime and Justice 42:1, 299-376.
Dagger, Richard (1993). “Playing Fair with Punishment.” Ethics 103, 473-88.
Dimock, Susan (1997). “Retributivism and Trust.” Law and Philosophy 16:1, 37-62.
Dolovich, Sharon (2009). “Cruelty, Prison Conditions, and the Eighth Amendment.” New York University Law Review 84:4, 881-979.
Duff, R. A. (2001). Punishment, Communication, and Community. Oxford, Oxford University Press.
Duff, R. A. (1986). Trials and Punishments. Cambridge, Cambridge University Press.
Falls, M. Margaret (1987). “Retribution, Reciprocity, and Respect for Persons.” Law and Philosophy 6, 25-51.
Feinberg, Joel (1965). “The Expressive Function of Punishment.” Monist 49:3, 397-423.
Goldman, Alan (1979). “The Paradox of Punishment.” Philosophy & Public Affairs 9:1, 42-58.
Hampton, Jean (1984). “The Moral Education Theory of Punishment.” Philosophy & Public Affairs 13, 208-38.
Hart, H. L. A. (1968). Punishment and Responsibility: Essays in the Philosophy of Law. New York, Oxford University Press.
Hegel, G. W. F. (1821). Philosophy of Right. Trans. S. W. Dyde. Reprinted by Dover Philosophical Classics, 2005.
Henrichson, Christian, and Ruth Delaney (2012). The Price of Prisons: What Incarceration Costs Taxpayers. Report of the Vera Institute of Justice, Center on Sentencing and Corrections.
Kant, Immanuel (1797). The Metaphysics of Morals. In Immanuel Kant, Practical Philosophy, trans. and ed. Mary J. Gregor. Cambridge, Cambridge University Press, 1996.
Kleinig, John (1973). Punishment and Desert. The Hague, Martinus Nijhoff.
Lippke, Richard (2001). “Criminal Offenders and Right Forfeiture.” Journal of Social Philosophy 32:1, 78-89.
Mauer, Marc, and Meda Chesney-Lind (eds.) (2002). Invisible Punishment: The Collateral Consequences of Mass Imprisonment. New York, The New Press.
McCloskey, H. J. (1957). “An Examination of Restricted Utilitarianism.” The Philosophical Review 66:4, 466-85.
McDermott, Daniel (2001). “The Permissibility of Punishment.” Law and Philosophy 20, 403-32.
Montague, Phillip (1995). Punishment as Societal Self-Defense. Lanham, Md., Rowman & Littlefield.
Moore, G. E. (1903). Principia Ethica. Cambridge, Cambridge University Press.
Moore, Michael S. (1987). “The Moral Worth of Retribution.” In Ferdinand Schoeman (ed.), Responsibility, Character, and the Emotions: New Essays in Moral Psychology. Cambridge, Cambridge University Press.
Morris, Christopher (1991). “Punishment and Loss of Moral Standing.” Canadian Journal of Philosophy 21, 53-79.
Morris, Herbert (1968). “Persons and Punishment.” Monist 52, 475-501.
Moskos, Peter (2011). In Defense of Flogging. New York, Basic Books.
Murphy, Jeffrie G. (1973). “Marxism and Retribution.” Philosophy & Public Affairs 2:3, 217-43.
Nagin, Daniel S. (2013). “Deterrence in the Twenty-First Century.” Crime and Justice 42:1, 199-263.
Nino, C. S. (1983). “A Consensual Theory of Punishment.” Philosophy & Public Affairs 12:4, 289-306.
Plato (1997). Crito. In Plato: Complete Works. Indianapolis, Hackett Publishing Company, Inc.
Reiman, Jeffrey H. (1985). “Justice, Civilization, and the Death Penalty: Answering van den Haag.” Philosophy & Public Affairs 14:2, 115-48.
Ross, W. D. (1930). The Right and the Good. Oxford, Oxford University Press.
Shafer-Landau, Russ (2000). “Retributivism and Desert.” Pacific Philosophical Quarterly 81, 189-214.
Simmons, John A. (1991). “Locke and the Right to Punish.” Philosophy & Public Affairs 20:4, 311-49.
Smart, J. J. C. (1973). “An Outline of a System of Utilitarian Ethics.” In J. J. C. Smart and Bernard Williams (eds.), Utilitarianism: For and Against. Cambridge, Cambridge University Press.
Tadros, Victor (2011). The Ends of Harm: The Moral Foundations of Criminal Law. Oxford, Oxford University Press.
Tonry, Michael (2011). “Proportionality, Parsimony, and Interchangeability of Punishments.” In Michael Tonry (ed.), Why Punish? How Much? A Reader on Punishment. Oxford, Oxford University Press.
Tonry, Michael (2006). “Purposes and Functions of Sentencing.” Crime and Justice 34:1, 1-52.
Von Hirsch, Andrew (1993). Censure and Sanctions. Oxford, Oxford University Press.
Wellman, Christopher Heath (2009). “Rights and State Punishment.” Journal of Philosophy 106:8, 419-39.
Zaibert, Leo (2006). Punishment and Retribution. Aldershot, U.K., Ashgate.

Author Information

Zachary Hoskins
Email: zachary.hoskins@nottingham.ac.uk
University of Nottingham
United Kingdom

Intentionality

If I think about a piano, something in my thought picks out a piano. If I talk about cigars, something in my speech refers to cigars. This feature of thoughts and words, whereby they pick out, refer to, or are about things, is intentionality. In a word, intentionality is aboutness.

Many mental states exhibit intentionality. If I believe that the weather is rainy today, this belief of mine is about today’s weather—that it is rainy. Desires are similarly directed at, or about things: if I desire a mosquito to buzz off, my desire is directed at the mosquito, and the possibility that it depart. Imaginings seem to be directed at particular imaginary scenarios, while regrets are directed at events or objects in the past, as are memories. And perceptions seem to be, similarly, directed at or about the objects we perceptually encounter in our environment. We call mental states that are directed at things in this way ‘intentional states’.

The major role played by intentionality in affairs of the mind led Brentano (1874) to regard intentionality as “the mark of the mental”, that is, a necessary and sufficient condition for mentality. But some non-mental phenomena seem to display intentionality too: pictures, signposts, and words, for example. Nevertheless, the intentionality of these phenomena seems to be derived from the intentionality of the minds that produce them. A sound is only a word if meaning has been conferred on it by the intentions of a speaker or perhaps a community of speakers; while a painting, however abstract, seems only to have a subject matter insofar as its painter intends it to. Whether or not all mental phenomena are intentional, then, it certainly seems to be the case that all intentional phenomena are mental in origin.

The root of the word ‘intentionality’ reflects the notion that it expresses, deriving from the Latin intentio, meaning ‘directed at’. Intentionality has been studied since antiquity and has generated numerous debates that can be broadly categorized into three areas that are discussed in the following sections:

Section 1 concerns the intentional relation: the relation between intentional states and their objects. Here we aim to answer the question “What determines why any given intentional state is about one thing and not another?” For example, what makes a thought about a sheep about that sheep? Does the thought look like the sheep? Or does it perhaps have a causal origin in an encounter with the sheep?

Section 2 explores the nature of the objects of intentional states. Are these objects independent of us, or somehow constituted by the nature of our minds? Do they have to exist, or can we have thoughts about non-existent objects like The Grinch?

Section 3 explores the nature of intentional states themselves. For example, are intentional states essentially rational states, such that only rational creatures can have them? Or might intentional states be necessarily conscious states? And is it possible to give a naturalized theory of intentionality that appeals only to facts describable in the natural sciences?

This article explores these questions, and the dominant theories that have been designed to answer them.

1. The Intentional Relation
a. Formal Theories of Intentionality
b. Problems for Forms, and the Causal Alternative
2. Intentional Objects
3. Intentional States
4. References and Further Reading

1. The Intentional Relation

If I am thinking about horses, what is it about my thought that makes it about horses and not, say, sheep? That is, in what relation do intentional states stand to their objects? This is the question “What is the intentional relation?” There have been many answers proposed to this question, and a broad division can be discerned in the history of philosophy between what can be called ‘formal’ and ‘causal’ theories.

a. Formal Theories of Intentionality

One answer to the question is that mental states refer to the things they do because of the intrinsic features of those mental states. The earliest version of this theory is based on Plato’s theory of forms. Plato held that apart from the matter (hyle) they are composed of, all things have another aspect, which he called their ‘form’ (morphê). All horses, for example, although individually made of different material, have something in common – and this is their form. The exact meaning of Plato’s ‘form’ is a controversial issue. On one reading, two things have the same form or are ‘conformal’ if they share the same shape; on a broader interpretation, two things are conformal if there is a one-to-one mapping between the essential features of the two—as there is between a building and an architect’s blueprint for the building. Plato held that when we think about an object, we have the form of the object in our mind, so that our thought literally shares the form of the object. Aristotle further developed this theory, arguing that in perception (sensu) the form of an object perceived is transmitted from the object to the mind of the perceiver. In the Middle Ages Thomas Aquinas defended and elaborated Aristotle’s theory, and in the Early Modern period the theory finds an heir in the work of the ‘British Empiricists’ Locke and Hume. Locke and Hume argued that ‘ideas’, which they considered to be the fundamental components of thought, refer to their objects because they are images of those objects, impressed on the mind through the action of the perceptual faculties.

Although images or shapes may play a role in thought, it is generally accepted that they cannot provide a complete account of intentionality. The relation between an image and its object is a relation of resemblance. But this presents a difficulty that was first raised against the formal theory by Ockham in the Middle Ages (King, 2007). The problem is that the relation of resemblance is ambiguous in a way that the intentional relation cannot be. An image of a man walking up a hill also resembles a man walking backwards down a hill (Wittgenstein, 1953), whereas a thought about a man walking up a hill is not also a thought about a man walking backwards down a hill. Similarly, while an image of Mahatma Gandhi resembles Mahatma Gandhi, it also resembles everyone who resembles Mahatma Gandhi (Goodman, 1976). Thoughts about Mahatma Gandhi on the other hand, are not thoughts about anyone who looks like Mahatma Gandhi.

An alternative formal model that seems to avoid this problem appeals to descriptions (Frege 1892, Russell 1912). This view holds that if I am thinking about something, then I must have in mind a description that uniquely identifies that thing. Descriptions seem to avoid the problem of ambiguity faced by images. There may be many people who resemble Mahatma Gandhi, but probably only one person who satisfies the description ‘the Indian Nationalist leader assassinated on the 30th of January 1948’. Since the ‘descriptivist’ account takes concepts to refer to their objects by describing them, so that the features of a concept somehow correspond to the features of its object, the descriptivist theory is arguably also a formal theory of intentionality.

In addition to answering the question why an intentional state refers to one object and not another, the formal approach is also helpful in explaining how thinkers understand what it is they are thinking about. One thing that we seem to be able to do when we have mental states that are directed at particular objects is to reflect upon different aspects of those objects, reason about them, describe them, and even make reliable predictions about them. For example, if I understand what horses are, and what sheep are, I ought to be in a position to tell you about their differences, and perhaps make good predictions about their behavior. If intentional states are conformal with their objects, we have some explanation for how such understanding is possible, since the form of the object the intentional state is directed at should be available to me if I reflect upon my own thoughts.

And we have another reason still for expecting that thoughts have a formal component. Frege (1892) observed that we can have multiple thoughts about the same thing, without realizing that we are thinking of the same thing in each case. The Ancient Greeks believed that Hesperus and Phosphorus (two Greek names for Venus) were two different stars in the sky, one of which appeared in the morning, while the other appeared in the evening. As a result they believed that Hesperus rises in the evening while simultaneously believing that Phosphorus does not. Of course Hesperus and Phosphorus, as it turns out, are the same object – the planet Venus, which rises both in the morning and in the evening. And so the Ancient Greeks had two contradictory beliefs about Venus, without realizing that both beliefs were about the same thing. The upshot is that it is possible for us to have distinct concepts that pick out the same thing without our knowing.

Frege proposed as an explanation that our concepts must vary in more ways than in what they refer to. They also vary, he proposed, in what he called their ‘sense’, so that two concepts could refer to the same object while differing in sense. He described the sense as the ‘mode of presentation’ of the object that a concept picks out. It would appear that by ‘mode of presentation’ he meant something like a description of the object. So, while the reference of someone’s hesperus and phosphorus concepts might be the same, the sense of hesperus might be ‘the star that appears in the evening’, while the sense of phosphorus could be ‘the star that appears in the morning’. Since it is perfectly rational to suppose that the object that satisfies the description ‘the star that appears in the morning’ might not be the same as the object that satisfies the description ‘the star that appears in the evening’, we now have an explanation for how one could have two concepts that pick out the same thing without knowing.

Supposing that the intentional relation is one of conformality, then, allows us to explain (i) why a thought refers to what it does, (ii) how we can have introspective knowledge of the things we think about, and (iii) how two or more of our concepts could pick out the same thing without our knowing. But there are problems facing the formal approach, which have led many to look for alternatives.

b. Problems for Forms, and the Causal Alternative

The formal theory of intentionality faces two major objections.

The first objection, sometimes called ‘the problem of ignorance and error’, is that the descriptions we have at our disposal of the objects we think about might be insufficient to uniquely identify those objects. Putnam (1975) articulated this objection using a now famous thought-experiment. Suppose that you are thinking of water. If the descriptive theory is right, for example, you must have at your disposal a description that uniquely distinguishes water from all other things. For most of us – chemists aside – such a description will amount to something like ‘the clear drinkable liquid in the rivers, lakes, and taps around here’. But suppose, suggests Putnam, that there is another planet far away from here, which looks to its inhabitants just like Earth looks to us. On that planet, let’s call it Twin-Earth, there is a clear drinkable liquid that the inhabitants of the planet refer to (coincidentally) as ‘water’, but that is in fact a different chemical substance; rather than H₂O, it has a different chemical composition—let’s call it XYZ. If this were true, we should expect that the description most people here on Earth are in a position to give of what we call ‘water’ will be just the same as the description the inhabitants of the other planet give of what they call ‘water’. But, by hypothesis, when we think about water we are thinking of the substance on our planet, H₂O, and when they think of what they call ‘water’, they are thinking of a different thing—XYZ. As a result, it would seem that descriptions are not sufficient to explain what we are thinking of, since a member of either of these groups will give the same description for what they call ‘water’, even though their thoughts pick out different substances. This is the ‘ignorance’ part of the problem—we often don’t have enough descriptive knowledge of the things we think about to uniquely identify those things. 
The ‘error’ part is that our beliefs about the things we think about often turn out to be false. For example, many people believe tomatoes are vegetables, not fruit; as a result, the description they will give of ‘tomato’ will include the claim that tomatoes are vegetables. If these people are indeed thinking of tomatoes, so the argument goes, it cannot be as a result of their being in possession of a description that picks out tomatoes, since no tomato truly falls under the description ‘vegetable’.

The second difficulty for the formal accounts, specifically directed at the descriptive account, is that descriptions do not identify the essential nature of the things they pick out, whereas many words and concepts do (Searle 1958, Kripke 1980). The description someone might offer of Hesperus could be ‘the brightest celestial object in the evening sky’. But it is perfectly coherent to suppose that Hesperus could have existed without having been visible in the evening. It could have drifted into a different orbital pattern, or have been occluded by a belt of asteroids, and therefore never have been visible in the evening. This description does not, therefore, capture an essential feature of Hesperus. The term ‘Hesperus’ in our thoughts, on the other hand, does pick out an essential feature of Hesperus—being Hesperus. That this is an important difference can be seen when we realize that concepts and descriptions seem to behave differently in thoughts about counterfactual possibilities—or, alternative ways the world could have turned out. For example, the thought ‘Hesperus could have failed to have been the brightest celestial object in the evening sky’, is clearly true—this could have been the case had it drifted into a different orbital pattern. But the thought ‘Hesperus could have failed to have been Hesperus’, is not true: there is no way the world could have turned out such that Hesperus could have failed to have been itself. The name ‘Hesperus’ therefore identifies the essence of Hesperus—what it couldn’t fail to be; but the description does not. So now we have a further reason for thinking that concepts are not cognitively equivalent to descriptions—since they behave differently in thoughts about counterfactual possibility.
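
The contrast between the two counterfactual thoughts can be sketched in the notation of modal logic. This is only an illustrative rendering, not a formalization given in the sources cited: the constant h (for Hesperus) and the predicate B (for ‘is the brightest celestial object in the evening sky’) are labels assumed here, and the \Diamond and \Box symbols (‘possibly’ and ‘necessarily’) assume the amssymb package.

```latex
% h: Hesperus;  B(x): x is the brightest celestial object in the evening sky
% (both labels are assumptions introduced for illustration)
\Diamond \, \neg B(h)       % true: Hesperus could have failed to be the brightest evening object
\Diamond \, \neg (h = h)    % false: Hesperus could not have failed to be Hesperus
\Box \, (h = h)             % equivalently: Hesperus is necessarily Hesperus
```

If the name simply abbreviated the description, the first two claims would stand or fall together; the fact that they come apart is what the objection trades on.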

As an alternative to descriptions, images, or forms of any sort, Putnam (1975) and Kripke (1980) propose a ‘causal’ model of intentionality. On this alternative model, our concepts do not have intrinsic formal features that determine what they refer to. Rather, a concept picks out the thing that originally caused it to occur in the mind of a thinker, or the thing it is causally related to in the mind-independent world. On this view, if I have a concept that picks out horses, this concept must have initially been caused to occur in me by a physical encounter with horses. If I have a concept that picks out water, the concept must have been caused to occur in me by a causal interaction with water. And if I have a concept that picks out Hesperus, this concept must have a causal origin in my apprehension of Hesperus, perhaps by seeing it in the sky.

We can see how the causal theory can be used to address the two major objections to the formal theory. Firstly, on the causal account, the ‘water’ thoughts of those on Earth can be distinguished from the ‘water’ thoughts of those on Twin-Earth: the substance Earthlings are causally interacting with when they have ‘water’ thoughts is H₂O, while the substance that Twin-Earthlings are causally interacting with is XYZ—explaining why the thoughts of each thinker refer to different things, even though the descriptions they might offer of those things are identical. Similarly, I can causally interact with water, or tomatoes, even if I have false beliefs about these things, so the causal model allows that the descriptions I might offer of the things I think about can be false. The causal model therefore seems to handle the problem of ignorance and error. Secondly, if we reject that my hesperus concept is cognitively equivalent to a description, the worry that the description fails to identify the essence of the object simply doesn’t arise. The causal model therefore also seems to handle the problem concerning reference to essential properties (sometimes called the ‘modal problem’).

However, the causal model has trouble explaining some of the things the formal model was designed to explain (see last paragraph of Section 1a above). Firstly, the causal model has trouble explaining (ii), how we can reflect on the objects of our thoughts, and say something about them. If concepts have no formal component that somehow describes their objects this becomes mysterious. The causal model also fails to explain (iii), how we can have multiple thoughts about the same thing without realizing. While formal models can explain this by holding that different concepts can be cognitively equivalent to different descriptions of the same thing, the causal model has trouble explaining this. Since the thoughts of an Ancient Greek about hesperus, and the thoughts of an Ancient Greek about phosphorus have a causal origin in the same object, namely Venus, the causal relation that stands between these concepts and their object is identical in each case; as a result, there ought to be no difference between the concepts on the causal model.

The formal and causal models therefore each provide good explanations for one set of phenomena, but run into trouble in explaining another.

Perhaps the best account of the intentional relation will be one that draws on aspects of both theories—something that so-called ‘two-dimensional’ accounts of intentionality aim to do (Chalmers 1996, 2006, Lewis 1997, Jackson 1998). On this approach, although it is necessary to know what environment a thinker is causally connected to in order to know what her thoughts refer to, this need not rule out that her concepts also have a formal component. The trick is to find a formal component that does not run into the problems raised by the causal theorist. To deal with the problem of error, for example, it has been proposed that the formal component of a concept might be a description of the appearance of the object the concept refers to (Searle 1983). Although I can be wrong that the things my tomato concept picks out are vegetables, it would seem that I cannot be mistaken that they are apparently red shiny edible objects—since I cannot be wrong about how the world appears to me. Such content would therefore avoid the problem of error—these descriptions couldn’t turn out to be false. To deal with the problem of ignorance, where my descriptive knowledge fails to uniquely determine which thing I am thinking of, it has been proposed to write the causal origin of my experience into the formal component. So, my concept water might be cognitively equivalent not just to ‘the apparently clear drinkable liquid in the lakes and rivers’, which fails to distinguish the water on Earth from the water on Twin-Earth, but to ‘the stuff causing my current experiences of an apparently drinkable liquid in the lakes and rivers’ (Searle 1983). This description, it would seem, does indeed distinguish water from Twin-Earth water, since only water is the causal source of my experiences (because I am on Earth, not Twin-Earth). 
And to get descriptions to behave the same way as concepts in thoughts about counterfactual possibility, it has been proposed to include the specification ‘actual’ in the descriptive content of a concept (Davies and Humberstone 1980). Although it is true that ‘the brightest celestial object in the evening sky could have failed to have been Hesperus’, it seems not to be true that ‘the actual thing that is the brightest celestial object in the evening sky could have failed to have been Hesperus’. By including ‘actual’ in the description, we can therefore get the description to behave in the same way as the concept in counterfactual thoughts. In sum, the descriptive content of a concept like water would be something like ‘the actual stuff causing my experience of an apparently clear drinkable liquid in the lakes and rivers’. Such content, it is hoped, can account for the phenomena formal models explain without running into the difficulties faced by earlier formal accounts. Whether these modifications really succeed in handling the problems raised by the causal theorist is, however, a topic of ongoing controversy (see Soames 2001, 2005 and Recanati 2013 for recent defenses of the causal approach; see Chalmers 2006 for a defense of the two-dimensional approach, and an advanced overview of the debate).

2. Intentional Objects

Having seen some of the layout of the debate about what determines the object of any intentional state, we can now consider issues that arise when we consider the objects themselves. Do they all have something in common that makes them appropriate as objects of intentional states? Might there be non-existent intentional objects? Do our thoughts connect directly with these objects or only indirectly, via our senses?

a. Intentional Inexistence

Franz Brentano has been mentioned already in this article, in part because his work set the tone for much of the debate over intentionality in the 20th century. One of his claims was that the objects of intentional states have a special type of existence, which he called ‘intentional inexistence’. Whether he meant by that a special sort of existence ‘in’ the intentional, or that intentional objects do not exist, is debated. Supposing that intentionality is always directed at objects that do not exist, however, is particularly problematic, and we’ll look at the difficulties it raises in the next section. So first I’ll explore the possibility that Brentano supposed that intentional objects have a special sort of existence as objects of intentional states.

This idea had a particularly strong influence on the work of Edmund Husserl, who founded a branch of philosophy of mind known as phenomenology, which he conceived of as the study of experience. Husserl emphasizes that the objects of thought have a particular character insofar as they are objects of thought. First, they have to be related to other concepts and ideas in the mind of the thinker in a coherent way, a feature he refers to as their ‘noematic’ character. If our ideas of the objects we encounter in experience conflict too severely with the constraints that our understanding of how the world works imposes, those ideas will disintegrate (something he calls ‘noematic explosion’). Visual illusions present a good example of this. If we are presented with an object that appears to be a cube sitting on a flat surface, we will approach the object with certain expectations, for example that if we turn our heads to one side we will see the side of the cube now out of view, that if we grab hold of it our grasp will be resisted, and so on. If the object turns out to be an image painted in such a way that it only appears as a cube from a certain angle, then when we discover this, for example by trying to pick it up, the idea we are working with of the object will disintegrate. It is in this sense that Husserl at least took the objects of thought to have a special sort of existence as objects of thought (Føllesdal 1992, Mooney 2010).

Husserl (1900) proposed that we can study the nature of the constraints that the character of our mind places on the possible objects of thought through a method he calls ‘phenomenological reduction’, which involves uncovering the conditions of our awareness of objects through reflection on the nature of experience. The approach inherits a great deal from Kant’s transcendental idealism, since in both cases we are required to recognize that the nature of our minds may impose a very specific character on objects as we encounter them in experience – a character that we should not be tempted to assume is imposed on our experience by facts about the external world. The idea that the nature of our minds imposes constraints on the way we experience the world is in fact a claim that is increasingly widely accepted, and phenomenology has become an area of particular interest for the emerging field of cognitive science (see for example Varela, Thompson and Rosch 1991).

b. Thinking About Things that Do Not Exist

The second possible interpretation of Brentano’s claim – that intentional objects do not exist – is particularly problematic. Whether or not all objects of thought are non-existent, it certainly seems that many are, including those that are obviously fictitious (The Grinch, Sherlock Holmes) or likely non-existent even if many people believe in them (Faeries, Hell). But deep puzzles arise when we consider what it means to say something about a non-existent object. Can we, for example, coherently state that Santa Claus has flying reindeer? If he does not exist, how can it be true that he has flying reindeer? Can we indeed even coherently state that Santa Claus does not exist? If he does, our statement is false. But if he does not exist, then it seems that our claim is not about anything – and hence apparently meaningless. Another way of putting the puzzle involves definite descriptions. It seems reasonable to say the following:

(1) The fairy king does not exist

But upon further consideration (1) is quite puzzling, because the appearance of the definite article ‘the’ in that statement seems to presuppose that there is such a thing as the fairy king to which we refer.

Russell proposed a famous solution to this puzzle. It involves, first, analyzing definite descriptions to show how we can use these to express claims about things that do not exist, and second, showing that most terms that we use to make negative existential claims are actually definite descriptions in disguise. The first move is accomplished by Russell’s analysis of the logical structure of definite descriptions. He takes definite descriptions to have the logical form ‘a unique thing that has the properties F and G’. So, the definite description ‘the fairy king’ in (1) on Russell’s reading is logically equivalent to the description ‘a unique thing that is both a king and a fairy’. Notably, this eliminates the term ‘the’ from the description, and with it the presupposition that there is a fairy king. And rather than being meaningless, the claim that such a thing does not exist is true, if no unique thing exists that is both a king and a fairy:

(2) There is no unique thing that is a king and a fairy

And, of course, false if there is a unique thing that is a king and a fairy. The second step of Russell’s solution is to hold that most referring terms in ordinary language are actually disguised definite descriptions. The term ‘Santa Claus’ on this view is actually a sort of shorthand for a description, perhaps ‘the man with the flying reindeer’. And this description is in turn to be analyzed as Russell proposes, so that the claim ‘Santa Claus does not exist’ in fact amounts to the denial that a unique individual that has the properties of being a man and having flying reindeer exists. And that seems to be perfectly coherent.
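
Russell’s analysis can be sketched in first-order notation. This is a standard textbook rendering rather than a quotation from Russell; the predicate letters K (‘is a king’) and F (‘is a fairy’) are labels assumed here for illustration.

```latex
% K(x): x is a king;  F(x): x is a fairy  (labels assumed for illustration)

% "The fairy king exists": something is a fairy king, and nothing else is
\exists x \, \bigl( K(x) \land F(x) \land \forall y \, ( (K(y) \land F(y)) \rightarrow y = x ) \bigr)

% "The fairy king does not exist", as in (2): the negation of the whole claim above
\neg \exists x \, \bigl( K(x) \land F(x) \land \forall y \, ( (K(y) \land F(y)) \rightarrow y = x ) \bigr)
```

On this rendering the negation takes scope over the entire existential claim, so no object needs to exist for the sentence to be meaningful; it is true exactly when nothing uniquely satisfies both predicates.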

Are there any terms, in language or thought, on this account, that are not descriptions? Russell’s view is that the simplest terms in thought, out of which definite descriptions are composed, are not descriptions but singular terms, whose meaning is simply the object they refer to. These are demonstrative terms like ‘that’ and ‘this’, and our concepts of sensible properties like colors, sounds and smells. The meanings of these terms are fixed by what Russell called ‘acquaintance’ – they acquire their meaning through a direct interaction between the thinker and the thing referred to, for example when we point at a color and simply think to ourselves ‘that’. These terms are only meaningful if in fact there are objects in the world to which they refer. Notice that on this view the second interpretation of Brentano’s claim – that in general the objects of thought do not exist – will become impossible to maintain. Since the descriptions that can pick out non-existent objects are composed of terms that are only meaningful if they refer to existing things, the objects of at least singular terms must exist for the view to make any sense.

c. Direct versus Indirect Intentionality

Even supposing that many objects of thought do exist, a further question arises as to whether the objects that we encounter in experience are products of our minds, or mind-independent objects. The view that the objects of experience are mind-dependent can be motivated by two complementary considerations. First, it seems reasonable to suppose that two different persons’ experiences in the same environment can be different. A color-blind person and a person with perfect color vision might have visually very different experiences in the same environment. Conversely, it seems that one person’s experiences in two very different environments could be the same. When I look at an oasis in the desert, I have a visual experience that might seem to be identical to the experience I have when faced with a mirage, even though these two environments are very different.

These considerations have led many to argue that our experiences – even those of ordinary objects – are mediated by what have been called ‘sense data’. According to the sense-data theorist, what we immediately experience are not mind-independent objects, but sense-data that are produced at least partly by our minds. This allows us to explain the two puzzles considered above. If what we encounter in experience are sense data and not mind-independent objects, then two people could have very different experiences in the same mind-independent environment, and correlatively, one person could have two indistinguishable experiences in two very different mind-independent environments. Note that these sense-data may correspond very closely to the way things stand in the mind-independent world around us, so the view need not imply that our interactions with the world should be dysfunctional.

This ‘indirect’ theory of perception, however, raises worries about our knowledge of the world. When we say of the ketchup before us that it is red, are we saying this about the ketchup, or about the sense-data that we experience as a result of looking at the ketchup? If we really only experience the sense-data, this would suggest that most of the beliefs we have about the world around us are false. We believe our intentional states are directed at mind-independent objects, but the indirect theory suggests that they are not. We believe we’ve seen red ketchup, but this theory suggests that in fact we’ve only seen sense-data of red ketchup. And if we only have experience of sense-data produced by our minds, this seems to imply that we have never really had any direct experience with the world. It suggests that we’ve never seen waterfalls, smelled flowers, or heard the voices of our friends, but have only experienced sense-data of these things.

An early reply to these concerns involves jettisoning the indirect-theory of perception, and adopting the view that there are no sense-data or any other kind of representations mediating our experiences of the objects around us – a view sometimes called ‘naive realism’, and associated with Moore (1903). But on this approach, explanations of hallucinations or variations between different individuals’ experiences of the same objects are strained. An interesting middle ground is known as disjunctivism (Hinton 1967, Snowdon 1981, McDowell 1994, Martin 2002). The disjunctivist holds that the argument for the indirect theory of perception based on hallucinations is fallacious. Although the experiences of the oasis and the mirage might well be indistinguishable for the subject of the experience, this need not imply that the experiences are really the same. Rather, since one experience is the product of an encounter with an oasis, and the other is not, there is a difference between the experiences—it is just one that the subject is unable to identify. As a result, the disjunctivist holds that when we have veridical experiences, we have direct encounters with objects in the world, and when we have hallucinations, what we experience are sense-data produced by our mind. The disjunctivist view, then, at least allows us to see that we might not be forced into the indirect theory of perception by the existence of hallucinations.

3. Intentional States

So far we have looked at the question of what determines the object of any given intentional state, and the question of what the nature of the objects of intentional states is. What we have not examined is whether there are broad conditions for a state to count as intentional in the first place. Are only rational creatures capable of intentional states? Are intentional states essentially conscious states? Can we provide an account of intentional states in natural terms?

a. Intentionality and Reason

The centrality of reason to the intentional is an important strand in Kant’s famous Critique of Pure Reason (1787), and has informed an influential line of thinking taken up in the work of Sellars (1956), Strawson (1959) and Brandom (1996). Kant argues that in the apprehension of any object, an individual must have a range of concepts at her disposal that she can use to rationally assess the nature of the object apprehended. In order to apprehend a material object, for example, a thinker must understand what causation is. If she does not understand what causation is, she will not understand that if the material object were to be pushed, it would move. Or if it were picked up and thrown against a wall, it would not go straight through the wall or disappear, but would be caused by the solidity of the wall to bounce backward. Without having the capacity to understand any of these issues, Kant argued, it would not be true to say that an individual apprehends the material object.

The appeal to the necessity of reason for concept-possession often goes hand in hand with the claim that our intentional states are all interdependent. Since I cannot have the concept material object, without the concept cause, then the two concepts depend on one another—and this may be the case for all our concepts, leading to a view known as ‘concept holism’. This raises a puzzle, however, that many think undermines the view. The concern is that if our concepts are interdependent in this way, then if any of my concepts change, all the others change with it. If for example I can only grasp the concept horse if I have the concept animal, then if my animal concept changes in some way, my horse concept will change along with it. If we couple this with the observation that our beliefs about the world are almost constantly being updated, as our day to day experience progresses, then the worry arises that we could literally never have the same thought twice. Any time my beliefs about the world change they will change at least one of my concepts, and if all of my concepts are interdependent, then whenever any of my beliefs change, they will all change. As a result, although it might seem to me that I had thoughts about horses both yesterday and today, this would not be true since the concept that occurred in my thoughts yesterday would not be the same concept as occurred in my thoughts today. Some who think this is an intolerable result adopt the view known as ‘concept atomism’, which holds that our concepts do not stand in essential relations to one another, but only to the external objects they refer to (Fodor and Lepore 1992). Atomism, however, seems to be committed to the claim that I could possess the concept horse without knowing what an animal is, and to the holist that seems as intolerable as concept holism seems to the atomist.

b. Intentionality and Intensionality

Another feature of intentional states that is sometimes thought to be essential is what is called ‘intensionality’ (with an ‘s’). This is the phenomenon whereby the objects of thought are presented to a thinker from a certain point of view—what Frege called a ‘mode of presentation’. We already encountered one of the puzzles that motivate this idea above discussing Frege’s puzzle, where the answer to the question why two concepts can be co-referential without a thinker knowing is proposed to be the fact that a thinker’s concepts pick out an object under a particular mode of presentation.

The potentially essential connection between intentionality and intensionality can be seen when we try to describe someone’s intentional states without bearing in mind their point of view. Recall the beliefs that Lois Lane has about Superman. Lois Lane believes she loves Superman, but does not believe she loves her colleague Clark Kent, not knowing that Superman is Clark Kent. (1) seems like a true description of Lois Lane’s belief about Superman:

(1) Lois Lane believes that she loves Superman

If (1) is true, however, and Superman is Clark Kent, then we might expect that we would state exactly the same thing if we substitute the name ‘Clark Kent’ for the name ‘Superman’ in (1). That would give us (2):

(2) Lois Lane believes that she loves Clark Kent

To many, however, it seems that there is something wrong with (2). If Superman walks into the room in his Clark Kent disguise, Lois will not light up as she does when he walks in without the disguise. If Lois is told that Clark Kent is in trouble, she will not infer that the man she loves is in trouble. A natural explanation for these facts is that the belief reported in (1) is not the same as the belief reported in (2). Since our reports about the beliefs of others may be false if we do not take into consideration the mode of presentation under which the objects of those beliefs are thought of by the holder of the belief, it seems like intensionality may be an essential feature of intentional states.

Another phenomenon that seems to tie intentionality to intensionality is shown in the fact that we cannot infer from the fact that someone has a belief about x, that x exists. This is unusual, since for most cases of predication (ascription of a property to an object), we can infer from the fact that we have ascribed a property to an object that the object exists. For example, if the claim that the sun is bright is true, it would seem to follow that there must be such a thing as the sun. That is, predication ordinarily permits existential generalization: if a property is truly predicated of an object, then some object with that property exists (Fa → ∃xFx). However, from the fact that I believe the sun is bright, it does not follow that there is such a thing as the sun. After all, I might just as easily believe, as Kant did, that phlogiston is the cause of combustion, but as we know, there is no such thing as phlogiston. If we combine these two claims we get a third claim: that neither the assertion nor the denial of a report of an intentional state entails that the proposition the intentional state is about is true or false. For example, we could truly assert that Kant believed that phlogiston causes combustion, but this does not entail that it is true that phlogiston causes combustion.

Chisholm (1956) thought that an intentional state is any state whose description has these three features: failure to preserve truth given the inter-substitution of co-referring terms (such as ‘Superman’ for ‘Clark Kent’), failure to allow existential generalization entailment (the existence of the intentional object), and failure to entail the truth of the object proposition (such as the belief ascribed to the thinker).
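
The three features can be set out schematically. In the sketch below, the operator B_s (read ‘s believes that’), the names a and b, the predicate F and the sentence letter p are all notation introduced here for illustration, not Chisholm’s own; the crossed turnstile is read ‘does not entail’.

```latex
% B_s : hypothetical operator, "s believes that"; a, b : names; F : a predicate; p : a sentence.
% Requires amsmath (align*, \text) and amssymb (\nvdash).
\begin{align*}
\text{(i)}\quad   & a = b,\; B_s(Fa) \;\nvdash\; B_s(Fb)
                  && \text{substitution of co-referring terms fails} \\
\text{(ii)}\quad  & B_s(Fa) \;\nvdash\; \exists x\,(x = a)
                  && \text{existential generalization fails} \\
\text{(iii)}\quad & B_s(p) \;\nvdash\; p, \quad B_s(p) \;\nvdash\; \neg p, \quad
                    \neg B_s(p) \;\nvdash\; p, \quad \neg B_s(p) \;\nvdash\; \neg p
                  && \text{truth of the embedded proposition is left open}
\end{align*}
```

With a = ‘Superman’, b = ‘Clark Kent’ and F = ‘is loved by Lois’, (i) is the failure illustrated by the Lois Lane case; with p = ‘phlogiston causes combustion’, (ii) and (iii) are the failures illustrated by the phlogiston example.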

However, these criteria do not seem to hold up for all intentional states. While it does not follow from the fact that Kant believes phlogiston causes combustion that there is such a thing as phlogiston, or that it is true that phlogiston causes combustion, these things would seem to follow if we held that Kant knew that phlogiston causes combustion. That is, it does not seem possible to have knowledge of things that do not exist, or of propositions that are not true, so if someone knows Fa, then an object with the property F must exist, and if someone knows that p, then p must be true. Knowledge ascriptions therefore do not satisfy the second and third conditions proposed by Chisholm, and yet they are surely intentional states. And perceptual states, which also seem to be intentional states, do not obviously satisfy any of the conditions. You cannot perceive something that does not exist, and you cannot perceive that p is the case if p is not the case, and additionally it is possible to intersubstitute co-referring terms in descriptions of perceptions. If it is true that Jimi Hendrix saw Bob Dylan at Woodstock, then it is true that Jimi Hendrix saw Robert Zimmerman at Woodstock, because Bob Dylan is Robert Zimmerman. Hendrix might not have believed that he saw Robert Zimmerman, or have known that he saw Robert Zimmerman, but nevertheless, if he saw Bob Dylan, he saw Robert Zimmerman. And perceptual states also seem quite clearly to be intentional states.

There is surely an important connection between intentionality and intensionality, then, but how it works in detail is clearly more complex than Chisholm thought.

c. Intentionality and Consciousness

A state of a creature is a conscious state if there is something it is like for the creature to be in that state. There is something it feels like for a person to have their hand pressed onto a hot grill, but there is not anything it feels like for a cheese sandwich to be pressed onto a hot grill. Do these conscious states have an essential connection to intentionality? Might intentionality depend on consciousness, or vice versa?

Some views take conscious states to be a kind of intentional state—thus holding that consciousness depends on intentionality. There are good prima facie grounds for holding this view. It is not obvious how I could be conscious of a horse being before me without my conscious state being directed at, or about, the horse. The idea that conscious states are a species of intentional state can be teased out in various ways. We might say that conscious content is simply intentional content that is available for rational evaluation, so that if I am conscious that it is raining, I have a mental state about the rain that I can reflect upon (Dennett 1991). Or we could say that conscious states always represent the world as being in such-and-such a way, so that if I am conscious that it is raining, I have a mental state that represents the world as being rainy right now (Tye 1995). Or, that conscious states are states that are naturally selected to indicate to a subject that her environment is in such and such a way, and again therefore intentional (Dretske 1995).

However, the view that there are ‘raw feels’ in our conscious experience that do not say anything at all about the world also has considerable pull. For example, you might think that when you’re conscious of the warmth of the sun on your face you can indeed reflect upon the fact and judge that it is sunny where you are, but that the warm feeling itself does not tell you that it is sunny. On this view there are two things here, the warm feeling, and the subsequent judgment ‘it is sunny’, which although formed on the basis of the feeling is nevertheless distinct from it (Ryle 1949, Sellars 1956, Peacocke 1983). On this view, conscious states are not intentional in themselves, since they do not in themselves represent the world as being in any particular way, even if they can be used to make judgments about the world.

On the other hand, we might think the dependence runs the other way: that intentional states depend on consciousness. We might suppose that it is hard to make sense of the claim that we could have mental states about the world without the world feeling any way at all to us. Searle (1983), for example, thinks that our notion of the mind essentially involves the notion of consciousness, so he denies that there could be essentially unconscious mental states. To deal with the case of beliefs or desires that I am not currently consciously entertaining, he argues that these must at least have the potential to become conscious in order to be properly understood as mental states.

This dependence claim has its skeptics too, however. The position known as ‘epiphenomenalism’ holds that there is no essential role for consciousness to play in our lives: that consciousness is caused by, but itself plays no causal role in, other mental events. We may happen to have conscious experiences concurrent with some of the events in our lives (such as intentional events), and they may even stand in constant conjunction with those events, but this in itself is not evidence that a creature could not exist that carries out the same activities with no conscious experiences at all. A real life example can get this intuition going. In a phenomenon sometimes called ‘blindsight’, subjects display above chance capacity to discriminate features of their environment while at least reporting that they have no corresponding conscious experience of these features. In one experiment, a subject is shown two drawings of a house, each identical in every respect except that one house is represented as being on fire. When asked, the subjects insist that they can see no difference between the two houses (the house on fire is in the visual region that the subject is having problems with). When pressed on which house they would prefer to live in, however, the subjects show an above chance preference for the house that is not represented as being on fire. Since the subjects seem to have distinct attitudes to the two pictures, hence distinct intentional states directed at each picture, and since there is no apparent variation in conscious experience, some take such cases to motivate the claim that it is possible to have intentional states without any conscious component.

d. Naturalizing Intentionality

Whatever the essence of intentionality might be, a further question that arises is whether we can ‘naturalize’ our account of it. That is to say, whether we can give an account of intentionality that can be exhaustively described in the terms in which the laws of nature are expressed. There is a long tradition of holding that the mind is outside of space and time – that it is an immaterial substance – and on that view, since intentional states are mental states, intentionality could not be naturalized. But particularly in the 20th century, there has been a push to reject the view that the mind is immaterial and to try to account for the mind in terms of natural processes, such as causal relations, natural selection, and any other process that can be explained in terms of the laws of the natural sciences.

The attempt faces various challenges. We have already looked at one, which is that if we take intentional states to depend on consciousness, and we hold that it is not possible to give a naturalized account of consciousness, then it follows that we cannot naturalize intentionality. But there is another particularly tricky puzzle facing the naturalization of intentionality in terms of causal relations. As we saw above (3b) at least some intentional states have the property of intensionality: it does not follow from the fact that I believe p that p is the case, and it does not follow from the fact that I do not believe p that p is not the case. Another way to put this is that our concepts do not always co-vary with the objects they represent. On the one hand we can encounter the objects our concepts refer to without our concepts triggering, for example, when Lois Lane meets Clark Kent and the thought ‘that’s Superman’ fails to occur to her. And conversely our concepts can be triggered when the object they refer to is not about, such as when I see a cow in the night and mistakenly think ‘there’s a horse’. Our concepts, in other words, can trigger when they should not, and can fail to trigger when they should. This is a problem for naturalizing intentionality, because the causal theory of intentionality (1b) is at the heart of attempts to naturalize intentionality, and the causal theory has trouble explaining intensionality. For example, the causal theory holds that a concept refers to whatever causes it to trigger. But if Lois Lane bumps into Clark Kent and her superman concept fails to trigger, this would suggest that Lois Lane’s superman concept does not refer to Clark Kent. And that’s not a good outcome, since Superman is Clark Kent. Similarly, if I see a cow in the night and my horse concept goes off, the causal account implies that my horse concept refers to cows in the night. And that’s no good either.

Dretske (1981) argues that causal relations can in fact exhibit intensionality, so that we can naturalize intentionality. A compass, he argues, indicates the location of the North Pole because the North Pole causes the compass needle to point at it. He takes a compass to be a ‘natural indicator’ of the North Pole, and so to exhibit natural intentionality. But he thinks the compass also exhibits intensionality. In addition to indicating the North Pole, the compass also indicates the location of polar bears, because there are polar bears at the North Pole. However, if the polar bears move south, the compass will not continue to indicate their location. As a result, suggests Dretske, the compass exhibits intensionality: the compass can fail to indicate the location of polar bears, even though the location of polar bears is the North Pole, just as Lois Lane’s superman concept can fail to indicate Clark Kent, even though Clark Kent is Superman. There is a problem with this account, however, because the relationship between the location of polar bears and the North Pole is very different to the relationship between Superman and Clark Kent. The location of the polar bears can fail to be where the North Pole is, but Clark Kent cannot fail to be where Superman is. That is, the kind of failure to trigger that we are concerned to explain is where a concept fails to trigger in response to what is necessarily identical to its reference – not in response to something that merely happens to be co-instantiated with its reference on some occasions.

Another attempt to allow for these cases within a causal theory appeals to the notion of a natural function or telos (Matthen and Levy 1984, Millikan 1984, Dretske 1995, Papineau 1993). If the heart has been selected by evolution to pump blood, then we can say that the natural function of the heart is to pump blood. But organs can malfunction, as we see when the heart stops, thus failing to continue to pump blood. What distinguishes the correct from the incorrect activities of the heart is whether the heart is doing what it was selected for by evolution. The teleological theory of intentionality proposes that this same mechanism distinguishes the correct and incorrect triggers of a concept. When my horse concept tokens in response to my encounter with a cow in the night, it is malfunctioning, because it was selected to alert me to the presence of horses. This account faces several objections, but the clearest is that it rules out the possibility of a creature having thoughts whose mental states did not come into being through natural selection. Although highly unlikely, it does not seem impossible that a being formally identical to a thinking person could come into existence by chance, through the right freak coincidence of physical events (in one story it involves lightning hitting a swamp and the right chemicals instantaneously bonding to form a molecule-for-molecule match of an adult human (Davidson 1987)). If the teleological theory of intentionality were right, such a being would have no intentional states since its brain states would have no natural history, even though it would be physically and behaviorally indistinguishable from a thinking person. Many see this as a reductio ad absurdum of the teleological account, since it seems that by hypothesis such a being would be able to perceive, form desires and beliefs about its environment, and so forth.

Another proposal still is that we can distinguish correct from incorrect triggers of a concept in terms of the relationship they stand in to one another: the incorrect triggers of a concept only cause the concept to trigger because the correct triggers do, but the correct triggers don’t trigger the concept because the incorrect ones do (Fodor 1987). To return to the cow in the night example, the proposal is that if horses didn’t cause my horse concept to trigger, cows in the night wouldn’t either: the reason cows in the night cause it to trigger is because horses cause it to trigger, and cows in the night look like horses. But the reverse is not the case: if cows in the night didn’t cause my horse concept to trigger, this needn’t mean that horses wouldn’t. Correct and incorrect triggers can therefore be identified by this ‘asymmetric dependence’ relation they have to one another. When we try to explain why the correct triggers would continue to cause a concept to token even if the incorrect triggers didn’t, however, the proposal becomes less convincing. Returning to the Twin-Earth example, if we travel to Twin-Earth our water concept will be triggered by the watery looking stuff there, presumably falsely. But since Twin-Earth water is by hypothesis ordinarily indistinguishable from Earth water, it seems wrong to say that if Twin-Earth water did not cause our water concept to trigger, Earth water still would. The reason Earth water causes our water concept to trigger, after all, is presumably because it looks, tastes and smells a certain way. But Twin-Earth water looks, tastes and smells exactly the same way, so it is far from clear why we should expect that if Twin-Earth water did not trigger our water concept Earth water still would. Fodor (1998) replies that we should discount Twin-Earth worries because Twin-Earth does not exist. But it is not clear that this helps, since we could surely discover a substance on Earth that we might not be able to distinguish from water, in which case the same worry can be raised without discussing Twin-Earth.
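
The asymmetric dependence relation discussed above can be stated as a pair of counterfactual claims. What follows is a rough sketch rather than Fodor’s own formulation: the arrow between a trigger and the concept H (‘h ⇒ H’, ‘horses cause tokenings of the horse concept’) and the counterfactual connective are shorthand introduced here for illustration.

```latex
% h \Rightarrow H : horses cause tokenings of the horse concept        (illustrative shorthand)
% c \Rightarrow H : cows in the night cause tokenings of the horse concept
% \boxright : counterfactual conditional, "if it were that ..., it would be that ..."
% Requires amsmath (align*) and amssymb (\Box); the \newcommand belongs in the preamble.
\newcommand{\boxright}{\mathrel{\Box\!\!\rightarrow}}
\begin{align*}
& \neg(h \Rightarrow H) \;\boxright\; \neg(c \Rightarrow H)
  && \text{the incorrect trigger depends on the correct one} \\
& \neg\bigl[\, \neg(c \Rightarrow H) \;\boxright\; \neg(h \Rightarrow H) \,\bigr]
  && \text{but not the other way around}
\end{align*}
```

The Twin-Earth objection targets the second line: when the incorrect trigger is Twin-Earth water rather than a cow in the night, it is unclear that the correct trigger would survive the loss of the incorrect one, so the asymmetry appears to fail.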

Needless to say there are further arguments made on behalf of these proposals, but as things stand, there is no widely accepted solution to the problem presented by intensionality for naturalizing intentionality.

4. References and Further Reading

Brandom, R. (1996). Making it Explicit. Harvard University Press.
Brentano, F. (1874/1911/1973). Psychology from an Empirical Standpoint, London: Routledge and Kegan Paul.
Chalmers, D. (1996). The Conscious Mind, Oxford: Oxford University Press.
Chalmers, D. (2006). “Foundations of Two-Dimensional Semantics.” In M. Garcia-Carpintero and J. Macia (eds). Two-Dimensional Semantics: Foundations and Applications. Oxford: Oxford University Press.
Chisholm, R. M. (1956). “Perceiving: a Philosophical Study,” chapter 11, selection in D. Rosenthal (ed.), The Nature of Mind, Oxford: Oxford University Press, 1990.
Davidson, D. (1980). Essays on Actions and Events, Oxford: Clarendon Press.
Davidson, D. (1987). “Knowing One’s Own Mind.” In Proceedings and Addresses of the American Philosophical Association, 60: 441–58.
Dennett, D.C. (1991). Consciousness Explained. Boston: Little Brown.
Dretske, F. (1981). Knowledge and the Flow of Information, Cambridge, Mass.: MIT Press.
Dretske, F. (1995). Naturalizing the Mind. Cambridge, Mass.: MIT Press.
Dreyfus, H.L. (ed.) (1982). Husserl, Intentionality and Cognitive Science, Cambridge, Mass.: MIT Press.
Evans, G. (1979). “Reference and Contingency.” The Monist, 62, 2 (April, 1979), 161-189.
Fodor, J.A. (1975). The Language of Thought, New York: Crowell.
Fodor, J.A. (1987). Psychosemantics, Cambridge, Mass.: MIT Press.
Fodor, J.A. (1998). Concepts: Where Cognitive Science Went Wrong, New York: Oxford University Press.
Fodor, J. A. and Lepore, E. (1992). Holism: A Shopper’s Guide. Oxford: Blackwell.
Føllesdal, D. (1982). “Husserl’s notion of Noema,” in H.L. Dreyfus (ed.), Husserl, Intentionality and Cognitive Science, Cambridge, Mass.: MIT Press.
Frege, G. (1892/1952). “On Sense and Reference.” In P. Geach and M. Black (eds.), Philosophical Writings of Gottlob Frege, Oxford: Blackwell, 1952.
Goodman, N. (1968). Languages of Art: An Approach to a Theory of Symbols. Indianapolis: The Bobbs-Merrill Company.
Haugeland, J. (1981). “Semantic Engines: an Introduction to Mind Design.” In J. Haugeland (ed.), Mind Design, Philosophy, Psychology, Artificial Intelligence, Cambridge, Mass.: MIT Press, 1981.
Hinton, J.M. (1967). “Visual Experiences.” Mind, 76: 217–227.
Husserl, E. (1900/1970). Logical Investigations, (Engl. Transl. by Findlay, J.N.), London: Routledge and Kegan Paul.
Jackson, F. (1998). From Metaphysics to Ethics. Oxford: Oxford University Press.
Kaplan, D. (1979). “Dthat.” In P. French, T. Uehling, and H. Wettstein (eds.), Contemporary Perspectives in the Philosophy of Language, Minneapolis: University of Minnesota Press.
King, P. (2007). “Rethinking Representation in the Middle Ages.” In Representation and Objects of Thought in Medieval Philosophy, edited by Henrik Lagerlund, Ashgate Press: 81-100.
Kim, J. (1993). Mind and Supervenience, Cambridge: Cambridge University Press.
Kripke, S. (1972/1980). Naming and Necessity, Oxford: Blackwell.
Martin, M.G.F. (2002). “The Transparency of Experience.” Mind and Language, 17: 376–425.
Matthen, M. & Levy, E. (1984). “Teleology, Error, and the Human Immune System.” Journal of Philosophy 81 (7): 351-372.
McDowell, J. (1994). Mind and World. Oxford: Oxford University Press.
McGinn, C. (1989). Mental Content, Oxford: Oxford University Press.
McGinn, C. (1990). Problems of Consciousness, Oxford: Blackwell.
Mill, J.S. (1884). A System of Logic, Ratiocinative and Inductive: Being a Connected View of the Principles of Evidence and the Methods of Scientific Investigation, New York: Harper.
Millikan, R.G. (1984). Language, Thought and Other Biological Objects, Cambridge, Mass.: MIT Press.
Mooney, T. (2010). “Understanding and Simple Seeing in Husserl.” Husserl Studies, 26: 19-48.
Moore, G.E. (1903). “The Refutation of Idealism.” Mind 12 (1903) 433-53.
Papineau, D. (1993). Philosophical Naturalism. Oxford: Blackwell.
Peacocke, C. (1983). Sense and Content: Experience, Thought and their Relations, Oxford: Oxford University Press.
Putnam, H. (1974). “The Meaning of ‘Meaning’,” in H. Putnam, Philosophical Papers, vol. II, Language, Mind and Reality, Cambridge: Cambridge University Press, 1975.
Recanati, F. (2013). Mental Files. Oxford University Press.
Russell, B. (1905/1956). “On Denoting,” in R. Marsh (ed.), Bertrand Russell, Logic and Knowledge, Essays 1901-1950, New York: Capricorn Books, 1956.
Russell, B. (1911). The Problems of Philosophy, (New York: Holt).
Ryle, G. (1949). The Concept of Mind. Oxford University Press.
Searle, J. (1958). “Do Proper Names have Sense?” Mind 67: 166-173.
Searle, J. (1983). Intentionality, Cambridge: Cambridge University Press.
Searle, J. (1994). “Intentionality (1),” in Guttenplan, S. (ed.) (1994) A Companion Volume to the Philosophy of Mind, Oxford: Blackwell.
Sellars, W. (1956/1997). “Empiricism and the Philosophy of Mind.” In Empiricism and the Philosophy of Mind: with an Introduction by Richard Rorty and a Study Guide by Robert Brandom, R. Brandom (ed.), Cambridge, MA: Harvard University Press.
Snowdon, P.F. (1981). “Perception, Vision and Causation.” Proceedings of the Aristotelian Society, New Series, 81: 175–92.
Soames, S. (2005). Reference and Description: The Case against Two-Dimensionalism. Princeton: Princeton University Press.
Strawson, P. (1959). The Bounds of Sense. Oxford University Press.
Tye, M. (1995). Ten Problems of Consciousness, Cambridge, Mass.: MIT Press.
Varela, F., Thompson, E., and Rosch, E. (1991). The Embodied Mind: Cognitive Science and Human Experience, Cambridge, Mass.: MIT Press.
Wittgenstein, L. (1953). Philosophical Investigations. Oxford: Blackwell.

Author Information

Cathal O’Madagain
Email: cathalcom@gmail.com
Ecole Normale Superieure, Paris
France

Ethical Expressivism

Broadly speaking, the term “expressivism” refers to a family of views in the philosophy of language according to which the meanings of claims in a particular area of discourse are to be understood in terms of whatever non-cognitive mental states those claims are supposed to express. More specifically, an expressivist theory of claims in some area of discourse, D, will typically affirm both of the following theses. The first thesis—psychological non-cognitivism—states that claims in D express mental states that are characteristically non-cognitive. Non-cognitive states are often distinguished by their world-to-mind direction of fit, which contrasts with the mind-to-world direction of fit exhibited by cognitive states like beliefs. Some common examples of non-cognitive states are desires, emotions, pro- and con-attitudes, commitments, and so forth. According to the second thesis—semantic ideationalism—the meanings or semantic contents of claims in D are in some sense given by the mental states that those claims express. This is in contrast with more traditional propositional or truth-conditional approaches to meaning, according to which the meanings of claims are to be understood in terms of either their truth-conditions or the propositions that they express.

An expressivist theory of truth claims—that is, claims of the form “p is true”—might hold that (i) “p is true” expresses a certain measure of confidence in, or agreement with, p, and that (ii) whatever the relevant mental state, for example, agreement with p, that state just is the meaning of “p is true”. In other words, when we claim that p is true, we neither describe p as true nor report the fact that p is true; rather, we express some non-cognitive attitude toward p (see Strawson 1949). Similar expressivist treatments have been given to knowledge claims (Austin 1970; Chrisman 2012), probability claims (Barker 2006; Price 2011; Yalcin 2012), claims about causation (Coventry 2006; Price 2011), and even claims about what is funny (Gert 2002; Dreier 2009).

“Ethical expressivism”, then, is the name for any view according to which (i) ethical claims—that is, claims like “x is wrong”, “y is a good person”, and “z is a virtue”—express non-cognitive mental states, and (ii) these states make up the meanings of ethical claims. (I shall henceforth use the term “expressivism” to refer only to ethical expressivism, unless otherwise noted.) This article begins with a brief account of the history of expressivism, and an explanation of its main motivations. This is followed by a description of the famous Frege-Geach Problem, and of the role that it played in shaping contemporary versions of the view. While these contemporary expressivisms may avoid the problem as it was originally posed, recent work in metaethics suggests that Geach’s worries were really just symptoms of a much deeper problem, which can actually take many forms. After characterizing this deeper problem—the Continuity Problem—and some of its more noteworthy manifestations, the article explores a few recent trends in the literature on expressivism, including the advent of so-called “hybrid” expressivist views. See also “Non-Cognitivism in Ethics.”

Expressivism and Non-Cognitivism: History and Motivations
The Frege-Geach Problem and Hare’s Way Out
The Expressivist Turn
The Continuity Problem
Recent Trends
References and Further Reading

1. Expressivism and Non-Cognitivism: History and Motivations

The first and primary purpose of this section is to lay out a brief history of ethical expressivism, paying particular attention to its main motivations. In addition to this, the section will also answer a question that many have had about expressivism, namely: what is the difference between expressivism and “non-cognitivism”?

The difference is partly an historical one, such that a history of expressivism must begin with its non-cognitivist ancestry. Discussions of early non-cognitivism typically involve three figures in particular—A. J. Ayer, C. L. Stevenson, and R. M. Hare—and in that respect, this one will be no different. But rather than focusing upon the substance of their views, in this section, we will be more interested in the main considerations that motivated them to take up non-cognitivism in the first place. As we shall see, early non-cognitivist views were motivated mostly by two concerns: first, a desire to avoid unwanted ontological commitments, especially to a realm of “spooky,” irreducibly normative properties; and second, a desire to capture an apparently very close connection between sincere ethical claims and motivation.

In the case of Ayer, his motivation for defending a version of non-cognitivism was relatively clear, since he explains in the Introduction of the second edition of Language, Truth, and Logic (1946), “[I]n putting forward the theory I was concerned with maintaining the general consistency of my position [logical positivism].” As is well known, logical positivists were rather austere in their ontological accommodations, and happy to let the natural sciences decide (for the most part) what gets accommodated. In fact, a common way to interpret their verificationism is as a kind of method for avoiding unwanted ontological commitments—“unwanted” because they do not conform to what Ayer himself described as his and other positivists’ “radical empiricism.” Claims in some area of discourse are meaningful, in the ordinary sense of that term—which, for Ayer, is just to say that they express propositions—only if they are either analytic or empirically verifiable. Claims that are neither analytic nor empirically verifiable—like most religious claims, for instance—are meaningless; they might express something, but not propositions.

Ayer’s positivism could perhaps make room for moral properties as long as those properties were understood as literally nothing but the natural properties into which philosophers sometimes analyze them (for example, maximizing pleasure, since this is in principle verifiable), but it left no room at all for the irreducibly normative properties that some at the time took to be the very subject-matter of ethics (see Moore 1903). So in order to “maintain the general consistency of his position,” and to avoid any commitment to empirically unverifiable, irreducibly normative properties, Ayer had to construe ordinary ethical claims as expressing something other than propositions. Moreover, for reasons unimportant to my purposes here, he argued that these claims express non-cognitive, motivational states of mind, in particular, emotions. It is for this reason that Ayer’s brand of non-cognitivism is often referred to as “emotivism”.

Stevenson likely shared some of Ayer’s ontological suspicions, but this pretty clearly is not what led him to non-cognitivism. Rather than being concerned to maintain the consistency of any pre-conceived philosophical principles, Stevenson begins by simply observing our ordinary practices of making ethical claims, and then he asks what kind of analysis of “good” is able to make the best sense out of these practices. For instance, in practice, he thinks ethical claims are made more to influence others than to inform them. In fact, in general, Stevenson seems especially impressed with what he called the “magnetism” of ethical claims—that is, their apparently close connection to people’s motivational states. But he thinks that other attempts to analyze “good” in terms of these motivational states have failed on two counts: (a) they make genuine ethical disagreement impossible, and (b) they compromise the autonomy of ethics, assigning ethical facts to the province of psychology, or sociology, or one of the natural sciences.

According to Stevenson, these other theories err in conceiving the connection between ethical claims and motivational states in terms of the former describing, or reporting, the latter—so that, for instance, the meaning of “Torture is wrong” consists in something like the proposition that I (the speaker) disapprove of torture. This is what led to problems (a) and (b) from above: two people who are merely describing or reporting their own attitudes toward torture cannot be genuinely disagreeing about its wrongness; and if the wrongness of torture were really just a matter of people’s attitudes toward it, then ethical inquiries could apparently be settled entirely by such means as introspection, psychoanalysis, or even just popular vote. Stevenson’s non-cognitivism, then, can be understood as an attempt to capture the relation between ethical claims and motivational states in a way that avoids these problems.

The solution, he thinks, is to allow that ethical claims have a different sort of meaning from ordinary descriptive claims. If ordinary descriptive claims have propositional meaning—that is, meaning that is a matter of the propositions they express—then ethical claims have what Stevenson called emotive meaning. “The emotive meaning of a word is a tendency of a word, arising through the history of its usage, to produce (result from) affective responses in people. It is the immediate aura of feeling which hovers about a word” (Stevenson 1937, p.23; see also Ogden and Richards 1923, 125ff). A claim like “Torture is the subject of today’s debate” may get its meaning from a proposition, but the claim “Torture is wrong” has emotive meaning, in that its meaning is somehow to be understood in terms of the motivational states that it is typically used either to express or to arouse.

If Ayer and Stevenson apparently disagreed over the meaningfulness of ethical claims, with Ayer at times insisting that such claims are meaningless, and Stevenson allowing that they have a special kind of non-propositional meaning, they were nonetheless united in affirming a negative semantic thesis, sometimes called semantic non-factualism, according to which claims in some area of discourse—in this case, ethical claims—do not express propositions, and, consequently, do not have truth-conditions. Regardless of whether or not ethical claims are meaningful in some special sense, they are not meaningful in the same way that ordinary descriptive claims are meaningful, that is, in the sense of expressing propositions. Ayer and Stevenson were also apparently united in affirming what we earlier called psychological non-cognitivism. So as the term shall be used here, ‘ethical non-cognitivism’ names any view that combines semantic non-factualism and psychological non-cognitivism, with respect to ethical claims.

According to Hare, ethical claims actually have two kinds of meaning: descriptive and prescriptive. To call a thing “good” is both (a) to say or imply that it has some context-specific set of non-moral properties (this is the claim’s descriptive meaning), and (b) to commend the thing in virtue of these properties (this is the claim’s prescriptive meaning). But importantly, the prescriptive meaning of ethical claims is primary: the set of properties that I ascribe to a thing when calling it “good” varies from context to context, but in all contexts, I use “good” for the purpose of commendation. For Hare, then, ethical claims are used not to express emotions, or to excite the emotions of others, but rather to guide actions. They do this by taking the imperative mood. That is, they are first and foremost prescriptions. For this reason, Hare’s view is often called “prescriptivism”.

It may be less clear than it was in the case of Ayer and Stevenson whether Hare’s prescriptivism ought to count as a version of non-cognitivism. After all, it is not uncommon to suppose that sentences in the imperative mood still have propositional content. Since he rarely goes in for talk of “expression”, it is unclear whether Hare is a psychological non-cognitivist. However, it would nonetheless be fair to say that, since prescriptions do not have truth-conditions, Hare is committed to saying that the relationship between prescriptive ethical claims and propositions is fundamentally different from that between ordinary descriptive claims and propositions; and in this sense, it does seem as if he is committed to a form of semantic non-factualism. It also seems right to think that if we do not express any sort of non-cognitive, approving attitude toward a thing when we call it “good,” then we do not really commend it. So even if he is not explicit in his adherence to it, Hare does seem to accept some form of psychological non-cognitivism as well.

Also unclear are Hare’s motivations for being an ethical non-cognitivist. By the time Hare published The Language of Morals (1952), non-cognitivism was already the dominant view in moral philosophy. So there was much less of a need for Hare to motivate the view than there was for Ayer and Stevenson a couple of decades earlier. Instead, Hare’s concern was mostly to give a more thorough articulation of the view than the other non-cognitivists had, and one sophisticated enough to avoid some of the problems that had already arisen for earlier versions of the view.

One thing that does appear to have motivated Hare’s non-cognitivism, however, is its ability to explain intuitions about moral supervenience. Most philosophers agree that there is some kind of relationship between a thing’s moral status and its non-moral features, such that two things cannot have different moral statuses without also having different non-moral features. This is roughly what it means to say that a thing’s moral features supervene upon its non-moral features. For example, if it is morally wrong for Stan to lie to his teacher, but not morally wrong for Stan to lie to his mother, then there must be some non-moral difference between the two actions that underlies and explains their moral difference, for example, something to do with Stan’s reasons for lying in each case. While non-philosophers may not be familiar with the term “supervenience”, the fact that we so often hold people accountable for judging like cases alike suggests that we do intuitively take the moral to supervene upon the non-moral.

Those philosophers, like Moore, who believe in irreducibly normative properties must explain how it is that, despite apparently not being reducible to non-moral properties, these properties are nonetheless able to supervene upon non-moral properties, which has proven to be an especially difficult task (see Blackburn 1988b). But non-cognitivists like Hare do not shoulder this difficult metaphysical burden. Instead, they explain intuitions about moral supervenience in terms of rational consistency. If Joan commends something in virtue of its non-moral properties, but then fails to commend something else with an identical set of properties, then she is inconsistent in her commendations, and thereby betrays a certain sort of irrationality. It is this simple expectation of rational consistency, and not some complicated thesis about the ontological relations that obtain between moral and non-moral properties, that explains our intuitions about moral supervenience.

Not long after Hare’s prescriptivism hit the scene, ethical non-cognitivism would be the target of an attack from Peter Geach. Given that the attack was premised upon a point made earlier by German philosopher Gottlob Frege, it has come to be known as the Frege-Geach Problem for non-cognitivism. In the next section, we will see what the Frege-Geach Problem is. Before doing so, however, let us briefly return to the question raised at the beginning of this section: what is the difference between expressivism and non-cognitivism?

In the introduction, we saw that ethical expressivism is essentially the combination of two theses concerning ethical claims: psychological non-cognitivism and semantic ideationalism. As we will see in Sections 2 and 3, the Frege-Geach Problem pressures the non-cognitivist to say more about the meanings of ethical claims than just the non-factualist thesis that they are not comprised of truth-evaluable propositions. It is partly in response to this pressure that contemporary non-cognitivists have been moved to accept semantic ideationalism. So the difference between expressivism and non-cognitivism is historical, but it is not merely historical. Rather, the difference is substantive as well: both expressivists and non-cognitivists accept some form of psychological non-cognitivism; but whereas the earlier non-cognitivists accepted a negative thesis about the contents of ethical claims—essentially, a thesis about how ethical claims do not get their meanings—contemporary expressivists accept a positive thesis about how ethical claims do get their meanings: ethical claims mean what they do in virtue of the non-cognitive mental states they express. It should be noted, however, that there are still many philosophers who use the terms “non-cognitivism” and “expressivism” interchangeably.

2. The Frege-Geach Problem and Hare’s Way Out

Non-cognitivist theories have met with a number of objections throughout the years, but none as famous as the so-called Frege-Geach Problem. As a point of entry into the problem, observe that there are ordinary linguistic contexts in which it seems correct to say that a proposition is being asserted, and contexts in which it seems incorrect to say that a proposition is being asserted. Consider the following two sentences:

(1) It is snowing.

(2) If it is snowing, then the kids will want to play outside.

In ordinary contexts, to make claim (1) is to assert that it is snowing. That is, when a speaker utters (1), she puts forward a certain proposition—in this case, the proposition that it is snowing—as true. Accordingly, if we happen to know that it is not snowing, it could be appropriate to say that the speaker is wrong. But when a speaker utters (2), she does not thereby assert that it is snowing. Someone can coherently utter (2) without having any idea whether it is snowing, or even knowing that it is not snowing. In the event that it is not snowing, we should not then say that the speaker of (2) is wrong. However, whether “It is snowing” is being asserted or not, it surely means the same thing in the antecedent of (2) as it does in (1). Equally, while we should not say that the speaker of (2) is wrong if it happens not to be snowing, it would nonetheless be correct, in that event, to say that both (1) and the antecedent of (2) are false.

This is what Geach calls “the Frege point,” a reference to German philosopher Gottlob Frege: “A thought may have just the same content whether you assent to its truth or not; a proposition may occur in discourse now asserted, now unasserted, and yet be recognizably the same proposition” (Geach 1965, p.449). The best way to account for the facts that (a) claim (1) and the antecedent of (2) have the same semantic contents, and that (b) they are both apparently capable of truth and falsity, is to suppose that claim (1) and the antecedent of (2) both express the proposition that it is snowing. So apparently, a claim’s expressing a proposition is something wholly independent of what a speaker happens to be doing with the claim, e.g., whether asserting it or not.

Now, we should note two things about the theories of early non-cognitivists like Ayer, Stevenson, and Hare. First, they are meant only to apply to claims in the relevant area of discourse—in this case, ethical claims—and are not supposed to generalize to other sorts of claims. In other words, theirs are apparently specialized, or “local,” semantic theories. So, for instance, most ethical non-cognitivists would agree that claim (1) expresses the proposition that it is snowing, and that this accounts for the meaning of (1). Second, perhaps understandably, ethical non-cognitivists focus their theories almost entirely upon ethical claims when they are asserted. The basic question is always something like this: what really is going on when a speaker makes an assertion of the form ‘x is wrong’? Does the speaker thereby describe x as wrong? Or might it be a kind of fallacy to assume that the speaker is engaged in an act of description, based only upon the surface grammar of the sentence? Might she instead be doing something expressive or evocative? Geach observes, “Theory after theory has been put forward to the effect that predicating some term ‘P’—which is always taken to mean: predicating ‘P’ assertorically—is not describing an object as being P but some other ‘performance’; and the contrary view is labeled ‘the Descriptive Fallacy’” (Geach 1965, p.461). Little attention is paid to ethical claims in contexts where they are not being asserted.

The Frege-Geach Problem can be understood as a consequence of these two features of non-cognitivist theories. As we saw earlier with claims (1) and (2), when we embed a claim into an unasserted context, like the antecedent of a conditional, we effectively strip the claim of its assertoric force. Claim (1) is assertoric, but the antecedent of (2) is not, despite having the same semantic content. But as Geach points out, exactly the same phenomenon occurs when we take a claim at the heart of some non-cognitivist theory and embed it into an unasserted context. This is why the Frege-Geach Problem is sometimes called the Embedding Problem. For example, consider the following two claims, similar in form to claims (1) and (2):

(3) Lying is wrong.

(4) If lying is wrong, then getting your little brother to lie is wrong.

As with claims (1) and (2) above, the relationship between a speaker and claim (3) is importantly different from the relationship between a speaker and the antecedent of claim (4). At least in ordinary contexts, a speaker of (3) asserts that lying is wrong, whereas a speaker of (4) does no such thing. But, assuming “the Frege point” applies here as well, the semantic contents of (3) and the antecedent of (4) do not depend upon whether they are being asserted or not. In both cases, their contents ought to be the same; and therein lies the rub for ethical non-cognitivists.

Given that their theories are meant to apply only to ethical claims, and not to claims in other areas of discourse, non-cognitivists are apparently committed to telling a radically different story about the semantic content of (3), as compared to the propositional story they would presumably join everyone else in telling about the contents of claims like (1) and (2). But whatever story they tell about the content of (3), it is unclear how it could apply coherently to the antecedent of (4) as well. Take Ayer, for instance. According to Ayer, claim (3) is semantically no different from

(3’) Lying!!

“where the shape and thickness of the exclamation marks show, by a suitable convention, that a special sort of moral disapproval is the feeling which is being expressed” (Ayer 1946/1952, p.107). Ayer believed that speakers of claims like (3) are not engaged in acts of description, but rather acts of expressing their non-cognitive attitudes toward various things. This is how Ayer’s theory treats the contents of ethical claims when they are asserted. Now, absent some independently compelling reason for thinking that “the Frege point” should not apply here, the same analysis ought to be given to the antecedent of (4). But the same analysis cannot be given to the antecedent of (4). For, just as a speaker can sincerely and coherently utter (2) without believing that it is snowing, a speaker can sincerely and coherently utter (4) without disapproving of lying. So whatever Ayer has to say about the content of the antecedent of (4), it cannot be that it consists in the expression of “a special sort of moral disapproval,” since a speaker of (4) does not express disapproval of lying. Apparently, then, he is committed to saying, counter-intuitively, that the contents of (3) and the antecedent of (4) are different.

As Geach poses it, the problem for the ethical non-cognitivist at this point is actually two-fold (see especially Geach 1965: 462-465). First, the non-cognitivist must explain how ethical claims are able to function as premises in logical inferences in the first place, if they do not express propositions. Traditionally, inference in logic is thought to be a matter of the truth-conditional relations that hold between propositions, and logical connectives like “and”, “or”, and “if-then” are thought to be truth-preserving functions from propositions to propositions. But as we have already seen, ethical non-cognitivists deny that ethical claims are even in the business of expressing propositions. So how, Geach wonders, are we apparently able to infer

(5) Therefore, getting your little brother to lie is wrong

from (3) and (4), if the content of (3) is nothing more than an attitude of disapproval toward lying? Or consider:

(6) Lying is wrong or it isn’t.

Claim (6) can be inferred from (3) by a familiar logical principle, and in non-ethical contexts, we account for this by explaining how disjunction relates two or more propositions. But how can someone who denies that (3) expresses a proposition explain the relationship between (3) and (6)? The second part of the problem, related to the first, is that the non-cognitivist must explain why the inference from (3) and (4) to (5), for instance, is a valid one. As any introductory logic student knows well, the validity of modus ponens depends upon the minor premise and the antecedent of the major premise having the same content. Otherwise, the argument equivocates, and the inference is invalid. But as we just saw, on the theories of non-cognitivists like Ayer, claim (3) and the antecedent of (4) apparently do not have the same content. So Ayer seems committed to saying that what appears to be a straightforward instance of modus ponens is in fact an invalid argument. This is the so-called Frege-Geach Problem for non-cognitivism as Geach originally put it.
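The second part of the problem can be set out schematically. What follows is only an illustrative sketch: the symbols P, P′, and Q are introduced here for convenience and are not Geach’s own notation. Let P stand for the content of “Lying is wrong” as it is asserted in (3), P′ for its content as it occurs unasserted in the antecedent of (4), and Q for the content of “Getting your little brother to lie is wrong.”

```latex
% Modus ponens: valid because both occurrences of P have the same content.
\[
  \frac{P \qquad P \rightarrow Q}{Q}
\]

% The inference from (3) and (4) to (5) on a reading like Ayer's, assuming
% that asserted and unasserted occurrences differ in content (P \neq P').
% The premises no longer share a term, so the conclusion does not follow.
\[
  \frac{P \qquad P' \rightarrow Q}{Q}
  \qquad \text{(invalid when } P \neq P' \text{)}
\]
```

On this way of putting it, the non-cognitivist must either show that P and P′ are after all the same content, or explain why the inference is valid even though they differ. (The \text command assumes the amsmath package.)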

In response to an argument very much like Geach’s (see Searle 1962), Hare appears to give non-cognitivists a “way out” of the Frege-Geach Problem (Hare 1970). As Hare sees it, the matter ultimately comes down to whether or not the non-cognitivist can adequately account for the compositionality of language, that is, the way the meanings of complex sentences are composed of the meanings of their simpler parts. As has already been noted, linguists and philosophers of language have traditionally done this by telling a story about propositions and the various relations that may hold between them—the meaning of (2), for instance, is composed of (a) the proposition that it is snowing, (b) the proposition that the kids will want to play outside, and (c) the conditional function “if-then”. The challenge for the non-cognitivist is simply to find another way to account for compositionality—though, it turns out, this is no simple matter.

Hare’s own proposal was to think of the meanings of ethical claims in terms of the sorts of acts for which they are suited and not in terms of propositions or mental states. The claim “Lying is wrong,” for instance, is especially suited for a particular sort of act, namely, the act of condemning lying. Thinking of the meanings of ethical claims in this way allows Hare and other non-cognitivists to effectively concede “the Frege point,” since suitability for an act is something wholly independent of whether a claim is being asserted or not. It allows them, for instance, to say that the content of (3) is the same as the content of the antecedent of (4), which, we saw, was a problem for theories like Ayer’s. From here, accounting for the meanings of complex ethical claims, like (4) and (6), is a matter of conceiving logical connectives not as functions from propositions to propositions, but rather as functions from speech acts to speech acts. If non-cognitivists could do something like this, that is, draw up a kind of “logic of speech acts”, then they would apparently have the resources for meeting both of Geach’s challenges. They could explain how ethical claims can function as premises in logical inferences, and they could explain why some of those inferences, and not others, are valid. Unfortunately, Hare himself stopped short of working out such a logic, but his 1970 paper would nonetheless pave the way for future expressivist theories and their own responses to the Frege-Geach Problem.

3. The Expressivist Turn

Earlier, it was noted that the difference between non-cognitivism and expressivism is both historical and substantive. To repeat, ethical non-cognitivists were united in affirming the negative semantic thesis that ethical claims do not get their meanings from truth-evaluable propositions, as in semantic non-factualism. But as we have already seen with Hare, the Frege-Geach Problem pressures non-cognitivists to say something more than this, in order to account for the meanings of both simple and complex ethical claims, and to explain how some ethical claims can be inferred from others.

Contemporary ethical expressivists respond to this pressure by doing just that: while still affirming the semantic non-factualism of their non-cognitivist ancestors, expressivists nowadays add to this the thesis that was earlier called semantic ideationalism. That is, they think that the meanings of ethical claims are constituted not by propositions, but by the very non-cognitive mental states that they have long been thought to express. In other words, if non-cognitivists “removed” propositions from the contents of ethical claims, then expressivists “replace” those propositions with mental states, or “ideas”—hence, ideationalism. It is this move, made primarily in response to the Frege-Geach Problem, and by following Hare’s lead, that constitutes the historical turn from ethical non-cognitivism to ethical expressivism. Both non-cognitivists and expressivists believe that ethical claims express non-cognitive attitudes, but expressivists are distinguished in thinking of the expression relation itself as a semantic one.

Ethical expressivism is often contrasted with another theory of the meanings of ethical claims according to which those meanings are closely related to speakers’ non-cognitive states of mind, namely, ethical subjectivism. Ethical subjectivism can be understood as the view that the meanings of ethical claims are propositions, but propositions about speakers’ attitudes. So whatever the relationship between claim (1) above and the proposition that it is snowing, the same relationship holds between claim (3) and the proposition that I (the speaker) disapprove of lying. So ethical subjectivists can also, with expressivists, say that ethical claims mean what they do in virtue of the non-cognitive states that they express. But whereas the expressivist accounts for this in terms of the way the claim itself directly expresses the relevant state, the subjectivist accounts for it in terms of the speaker indirectly expressing the relevant state by expressing a proposition that refers to it.

The contrast between expressivism and subjectivism is important not only for the purpose of understanding what expressivism is, but also for seeing a significant advantage that it is supposed to have over subjectivism. Suppose Jones and Smith are engaged in a debate about the wrongness of lying, with Jones claiming that it is wrong, and Smith claiming that it is not wrong. Presumably, for this to count as a genuine disagreement, it must be the case that their claims have incompatible contents. But according to subjectivism, the contents of their claims, respectively, are the propositions that I (Jones) disapprove of lying and that I (Smith) do not disapprove of lying. Clearly, though, these two propositions are perfectly compatible with each other. Where, then, is the disagreement? This is often thought to be a particularly devastating problem for ethical subjectivism, namely, that it cannot adequately account for genuine moral disagreement; but it is apparently not a problem for expressivists. According to expressivism, the disagreement is simply a matter of Jones and Smith directly expressing incompatible states of mind. This is one of the advantages of supposing that the semantic contents of ethical claims just are mental states, and not propositions about mental states.

Now, recall the two motivations that first led people to accept ethical non-cognitivism. The first was a desire to avoid any ontological commitment to “spooky,” irreducibly normative properties. Moral realists, roughly speaking, are those who believe that properties like goodness and wrongness have every bit the same ontological status as other, less controversial properties, like roundness and solidity; that is, moral properties are no less “real” than non-moral properties. But especially for those philosophers committed to a thoroughgoing metaphysical naturalism, it is hard to see how things like goodness and wrongness could have such a status. This is especially so when it is noted, as Mackie famously does, that moral properties as realists typically conceive them are somehow supposed to have a kind of built-in capacity to motivate those who apprehend them, a capacity apparently not had by any other property, to say nothing of how they are supposed to be apprehended (see Mackie 1977, pp.38-42). Ethical expressivists avoid this problem by denying that people who make ethical claims are even engaged in the task of ascribing moral properties to things in the first place. Ontologically speaking, expressivism demands little more of the world than people’s attitudes and the speakers who express them, and so, it nicely satisfies the first of the two non-cognitivist desiderata.

The second desideratum was a desire to accommodate an apparently very close connection between ethical claims and motivation. In simple terms, motivational internalism is the view that a necessary condition for moral judgment is that the speaker be motivated to act accordingly. In other words, if Jones judges that lying is wrong, but has no motivation whatsoever to refrain from lying, or to condemn those who lie, or whatever, then internalists will typically say that Jones must not really judge lying to be wrong. Even if motivational internalism is false, though, it is surely right that we expect people’s ethical claims to be accompanied by motivations to act in certain ways; and when people who make ethical claims seem not to be motivated to act in these ways, we often assume either that they are being insincere or that something else has gone wrong. Sincere ethical claims just seem to “come with” corresponding motivations. Here, too, expressivism seems well suited to account for this feature of ethical claims, since it takes ethical claims to directly express non-cognitive states of mind (for example, desires, emotions, attitudes, and commitments), and these states are either capable of motivating by themselves, or at least closely tied to motivation. So while ethical expressivists distinguish themselves from earlier non-cognitivists by accepting the thesis of semantic ideationalism, they are no less capable of accommodating the very same considerations that motivated non-cognitivism in the first place.

Finally, return to the Frege-Geach Problem. As we saw in the previous section, Geach originally posed it as a kind of logical problem for non-cognitivists: by denying that claims in the relevant area of discourse express propositions, non-cognitivists take on the burden of explaining how such claims can be involved in logical inference, and why some such inferences are valid and others invalid. Hare took a first step toward meeting this challenge by proposing that we understand the contents of ethical claims in terms of speech acts, and then work out a kind of “logic” of speech acts. Contemporary expressivists, since they understand the contents of ethical claims not in terms of speech acts but in terms of mental states, are committed to doing something similar with whatever non-cognitive states they think are expressed by these claims. In other words, as it is sometimes put, expressivists owe us a kind of “logic of attitudes.”

Here, again, is our test case:

(3) Lying is wrong.

(4) If lying is wrong, then getting your little brother to lie is wrong.

(5) Therefore, getting your little brother to lie is wrong.

If the meanings of (3), (4), and (5) are to be understood solely in terms of mental states, and not in terms of propositions, how is it that we can infer (5) from (3) and (4)? And why is the inference valid?

In some of his earlier work on this, Blackburn (1984) answers these questions by suggesting that complex ethical claims like (4) express higher-order non-cognitive states, in this case, something like a commitment to disapproving of getting one’s little brother to lie upon disapproving of lying. If someone sincerely disapproves of lying, and is also committed to disapproving of getting her little brother to lie as long as she disapproves of lying—the two states expressed by (3) and (4), respectively—then she thereby commits herself to disapproving of getting her little brother to lie. This is one sense in which (5) might “follow from” (3) and (4), even if it is not exactly the entailment relation with which we are all familiar from introductory logic.

Furthermore, a familiar way to account for the validity of inferences like (3)-(5) is by saying that it is impossible for the premises to be true and for the conclusion to be false. But if the expressivist takes something like the approach under consideration here, he will presumably have to say something different, since it is certainly possible for someone to hold both of the attitudes expressed by (3) and (4) without also holding the attitude expressed by (5). So for instance, the expressivist might say something like this: while a person certainly can hold the attitudes expressed by (3) and (4) without also holding the attitude expressed by (5), such a person would nonetheless exhibit a kind of inconsistency in her attitudes; she would have what Blackburn calls a “fractured sensibility” (1984: 195). It is this inconsistency that might explain why the move from (3) and (4) to (5) is “valid,” provided that we allow for this alternative sense of validity. Recall that this is essentially the same sort of inconsistency of attitudes that Hare thought underlies our intuitions about moral supervenience.
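
The contrast between the two senses of validity can be put schematically. The following is only an illustrative sketch: the symbols $A_3$, $A_4$, $A_5$ (the attitudes expressed by (3), (4), and (5)), $S$ (a person's total set of attitudes), and the predicate $\mathrm{Fractured}$ are notation assumed here for exposition, not notation used by Blackburn.

```latex
% Familiar (truth-based) validity: no valuation v makes the premises true
% and the conclusion false.
\[
(3),\ (4) \models (5)
\quad\Longleftrightarrow\quad
\neg\exists v\,\bigl[\, v(3) = v(4) = \mathrm{T} \ \wedge\ v(5) = \mathrm{F} \,\bigr]
\]

% Attitude-based surrogate (assumed notation): it is possible to hold A_3 and A_4
% while lacking A_5, but any sensibility S that does so counts as "fractured".
\[
\forall S\,\bigl[\, (A_3 \in S \ \wedge\ A_4 \in S \ \wedge\ A_5 \notin S)
\ \rightarrow\ \mathrm{Fractured}(S) \,\bigr]
\]
```

On this reading, the first condition rules out a combination of truth values, whereas the second only classifies a combination of attitudes as defective; nothing in the second says that the combination is impossible.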

This is just one way in which expressivists might attempt to solve the Frege-Geach Problem. Others have attempted different sorts of “logics of attitudes,” with mixed results. In early twenty-first century discourse, the debate about whether such a thing as a “logic of attitudes” is even possible—and if so, what it should look like—is ongoing.

4. The Continuity Problem

Even if expressivists can solve, or at least avoid, the Frege-Geach Problem as Geach originally posed it, there is a deeper problem that they face, a kind of “problem behind the problem”, and that will be the subject of this section. To get a sense of the problem, consider that expressivists have taken a position that effectively pulls them in two opposing directions. On the one hand, since the earliest days of non-cognitivism, philosophers in the expressivist tradition have wanted to draw some sort of sharp contrast between claims in the relevant area of discourse and claims outside of that area of discourse, that is, between ethical and non-ethical claims. But on the other hand, and this is the deeper issue that one might think lies behind the Frege-Geach Problem, ethical claims seem to behave in all sorts of logical and semantic contexts just like their non-ethical counterparts. Ethical claims are apparently no different from non-ethical claims in being (a) embeddable into unasserted contexts, like disjunctions and the antecedents of conditionals, (b) involved in logical inferences, (c) posed as questions, (d) translated across different languages, (e) negated, (f) supported with reasons, and (g) used to articulate the objects of various states of mind, for example, we can say that Jones believes that lying is wrong, Anderson regrets that lying is wrong, and Black wonders whether lying is wrong, to name just a few. It is in accounting for the many apparent continuities between ethical and non-ethical claims that expressivists run into serious problems. So call the general problem here the Continuity Problem for expressivism.

One very significant step that expressivists have taken in order to solve the Continuity Problem is to expand their semantic ideationalism to apply to claims of all sorts, and not just to claims in the relevant area of discourse. So, in the same way that ethical claims get their meanings from non-cognitive mental states, non-ethical claims get their meanings from whatever states of mind they express. In other words, expressivists attempt to solve the Continuity Problem by swapping their “local” semantic ideationalism, that is, ideationalism specifically with respect to claims in the discourse of concern, for a more “global” ideationalist semantics intended to apply to claims in all areas of discourse. This is remarkable, as it represents a wholesale departure from the more traditional propositionalist semantics according to which sentences mean what they do in virtue of the propositions they express. Recall the earlier claims:

(1) It is snowing.

(3) Lying is wrong.

According to most contemporary expressivists, the meanings of both (1) and (3) are to be understood in terms of the mental states they express. Claim (3) expresses something like disapproval of lying, and claim (1) expresses the belief that it is snowing, as opposed to the proposition that it is snowing. So even if ethical and non-ethical claims express different kinds of states, their meanings are nonetheless accounted for in the same way, that is, in terms of whatever mental states the relevant claims are supposed to express.

If nothing else, this promises to be an important first step toward solving the Continuity Problem. But taking this step, from local to global semantic ideationalism, may prove to be more trouble than it is worth, as it appears to raise all sorts of other problems, a few of which we shall consider here under the general banner of the Continuity Problem.

a. A Puzzle about Negation

Keeping in mind that expressivism now appears to hinge upon it being the case that an ideationalist approach to semantics can account for all of the same logical and linguistic phenomena that the more traditional propositional or truth-conditional approaches to semantics can account for, consider a simple case of negation:

(1) It is snowing.

(7) It is not snowing.

On an ideationalist approach to meaning, (1) gets its meaning from the belief that it is snowing, and (7) gets its meaning from either the belief that it is not snowing, or perhaps a state of disbelief that it is snowing, assuming, for now, that a state of disbelief is something different from a mere lack of belief. A claim and its negation ought to have incompatible contents, and this is apparently how an ideationalist would account for the incompatibility of (1) and (7). But now consider a case of an ethical claim and its negation:

(3) Lying is wrong.

(8) Lying is not wrong.

We saw these claims earlier, in Section 3, when discussing how expressivists are supposed to be able to account for genuine moral disagreement in a way better than ethical subjectivists. Basically, expressivists account for such disagreement by supposing that a speaker of (3) and a speaker of (8) express incompatible mental states, as is the case with (1) and (7). But if the incompatible states in the case of (1) and (7) are states of belief that p and belief that not-p (or belief and disbelief), what are the incompatible states in this case?

The non-cognitive mental state expressed by (3) is presumably something like disapproval of lying. So what is the non-cognitive state that is expressed by (8)? On the face of it, this seems like it should be an easy question to answer, but upon reflection, it turns out to be really quite puzzling. Whatever is expressed by (8), it should be something that is independently plausible as the content of such a claim, and it should be something that is somehow incompatible with the state expressed by (3). But what is it?

To see why this is puzzling, consider the following three sentences (adapted from Unwin 1999 and 2001):

(9) Jones does not think that lying is wrong.

(10) Jones thinks that not lying is wrong.

(11) Jones thinks that lying is not wrong.

These three sentences say three importantly different things about Jones. Furthermore, it seems as if the state attributed to Jones in (11) should be the very same state as the one expressed by (8) above. But again, what is that state? Let us proceed by process of elimination. It cannot be that (11) attributes to Jones a state of approval, that is, approving of lying. Presumably, for Jones to approve of lying would be for Jones to think that lying is right, or good. But that is not what (11) says; it says only that he thinks lying is not wrong. Nor can (11) attribute to Jones a lack of disapproval of lying, since that is what is attributed in (9), and as we’ve already agreed, (9) and (11) tell us different things about Jones. Moreover, (11) also cannot attribute to Jones the state of disapproval of not lying, since that is the state being attributed in (10). But at this point, it is hard to see what mental state is left to be attributed to Jones in (11), and to be the content of (8).
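
The structure of the puzzle can be laid out schematically. The notation below is an assumption introduced here for illustration and is not taken from Unwin: $\mathrm{Dis}_J(x)$ abbreviates “Jones disapproves of $x$”, $\mathrm{Bel}_J(p)$ abbreviates “Jones believes that $p$”, and $\ell$ stands for lying.

```latex
% Attitude-based analysis: only two places for a negation sign.
\begin{align*}
\text{Jones thinks that lying is wrong:} \quad & \mathrm{Dis}_J(\ell) \\
\text{(9) Jones does not think that lying is wrong:} \quad & \neg\,\mathrm{Dis}_J(\ell) \\
\text{(10) Jones thinks that not lying is wrong:} \quad & \mathrm{Dis}_J(\neg \ell) \\
\text{(11) Jones thinks that lying is not wrong:} \quad & \text{?}
\end{align*}

% Belief-based analysis, for comparison: three places for a negation sign.
\begin{align*}
\text{(9)} \quad & \neg\,\mathrm{Bel}_J\bigl(\mathrm{wrong}(\ell)\bigr) \\
\text{(10)} \quad & \mathrm{Bel}_J\bigl(\mathrm{wrong}(\neg \ell)\bigr) \\
\text{(11)} \quad & \mathrm{Bel}_J\bigl(\neg\,\mathrm{wrong}(\ell)\bigr)
\end{align*}
```

Put this way, the difficulty is that the simple attitude-based analysis offers fewer positions in which a negation can be placed than the sentences (9)-(11) require, whereas an analysis in terms of belief in a proposition has a position for each.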

The expressivist does not want to say that (3) and (8) express incompatible beliefs, or states of belief and disbelief, as with (1) and (7), since beliefs are cognitive states, and we know that expressivists are psychological non-cognitivists. If (3) and (8) express beliefs, and we share with Hume the idea that beliefs by themselves are incapable of motivating, then we will apparently not have the resources for explaining the close connection between people sincerely making one of these claims and their being motivated to act accordingly. Nor does the expressivist want to say that (3) and (8) express inconsistent propositions, since that would be to abandon her semantic non-factualism. Propositions are often thought to determine truth conditions, and truth conditions are often thought to be ways the world might be. So to allow that (3) and (8) express propositions would presumably be to allow that there is a way the world might be that would make it true that lying is wrong. Furthermore, accounting for this would involve the expressivist in precisely the sort of moral metaphysical inquiries she seeks to avoid. For these reasons, it is crucial for the expressivist to find a non-cognitive mental state to be the content of (8). It must be something incompatible with the state expressed by (3), and it must be a plausible candidate for the state attributed to Jones in (11). But as we have seen, it is very difficult to articulate just what state it is.

Expressivists must show us that, even after accepting global semantic ideationalism, we are still able to account for all of the same phenomena as those accounted for by traditional propositional approaches to meaning. But here it seems they struggle even with something as simple as negation. Further, until they provide a satisfactory explanation of the contents of negated ethical claims, it will remain unclear whether they really do have a better account of moral disagreement than ethical subjectivists, as has long been claimed.

b. Making Sense of Attitude Ascriptions

Earlier, it was noted that ethical claims are no different from non-ethical claims in being able to articulate the objects of various states of mind. Let us now look closer at why expressivists may have a problem accounting for this particular point of continuity between ethical and non-ethical discourse.

(12) Frank fears that it is snowing.

(13) Wanda wonders whether it is snowing.

(14) Haddie hates that it is snowing.

Claims (12)-(14) ascribe three different attitudes to Frank, Wanda, and Haddie. Clearly, however, these three attitudes have something in common, something that can be represented by the claim from earlier:

(1) It is snowing.

Traditionally, the way that philosophers of mind and language have accounted for this is by saying that (1) expresses the proposition that it is snowing, and that what all three of the attitudes ascribed to Frank, Wanda, and Haddie have in common is that they are all directed at one and the same proposition, that is, they all have the same proposition as their object.

By abandoning traditional propositional semantics, though, expressivists take on the burden of finding some other way of explaining how the contents of expressions like “fears that”, “wonders whether”, and “hates that” are supposed to relate to the content of whatever follows them. If the content of (1) is supposed to be something like the belief that it is snowing, as ideationalists suppose, and (1) is also supposed to be able to articulate the object of Frank’s fear, then the expressivist is apparently committed to thinking that Frank’s fear is actually directed at the belief that it is snowing. But, of course, Frank is not afraid of the belief that it is snowing—he is not afraid to believe that it is snowing—rather, he is afraid that it is snowing.

Things are no less problematic in the ethical case. For consider:

(15) Frank fears that lying is wrong.

(16) Wanda wonders whether lying is wrong.

(17) Haddie hates that lying is wrong.

Here again, it seems right to say that the attitudes ascribed in (15)-(17) all share something in common, something that can be represented by the claim from earlier:

(3) Lying is wrong.

But if it is denied that (3) expresses a proposition, as ethical expressivists and non-cognitivists always have, it becomes unclear how (3) could be used to articulate the object of those attitudes. Focus upon (15) for a moment. Now, what are the contents of ‘fears that’ and ‘lying is wrong’, such that the latter is the object of the former? We presumably have one answer already, from the expressivist: the content of ‘lying is wrong’ in (15), like the content of (3), is an attitude of disapproval toward lying. However, on the plausible assumption that the content of “fears that” is an attitude of fear toward the content of whatever follows, we apparently get the expressivist saying that (15) ascribes to Frank a fear of disapproval of lying, or a fear of disapproving of lying. But surely that is not what (15) ascribes to Frank. He may fear these other things as well, but (15) says only that he fears that lying is wrong.

The expressivist may try to avoid this puzzle by insisting that “lying is wrong” as it appears in (15) has a content that is different from the content of (3), but this still leaves us wondering what the meanings of claims like (15)-(17) are supposed to be, according to the expressivist’s ideationalist semantics. As Schroeder explains, expressivists “owe an account of the meaning of each and every attitude verb, for example, fears that, wonders whether, and so on; just as much as they owe an account of “not”, “and”, and “if … then”. Very little progress has yet been made on how non-cognitivists [or expressivists] can treat attitude verbs, and the prospects for further progress look dim” (Schroeder 2008d, p.716).

c. Saving the Differences

One might think that a simple way to defeat any non-factualist account of ethical claims is simply to point out that we can coherently embed ethical claims into truth claims. It makes perfect sense, for instance, for someone to say, “It is true that lying is wrong.” Presumably, however, this could only make sense if whatever follows “It is true that” is the sort of thing that can be true. Of course, propositions are among the sorts of things that can be true; in fact, this is often thought to be their distinguishing characteristic. But non-factualists deny that ethical claims express propositions. So how do they account for the fact that the truth-predicate seems to apply just as well to ethical claims as it does to non-ethical claims?

If this were a devastating problem for non-cognitivists, then the non-cognitivist tradition in ethics would not have lasted for very long, since philosophers were well aware of the matter soon after Ayer first published Language, Truth, and Logic in 1936. The thought then, essentially just an application of Ramsey’s (1927) famous redundancy theory of truth, was that, in at least some cases, the truth-predicate does not actually ascribe some metaphysically robust property, being true, to whatever it is being predicated of. Rather, to add the truth-predicate to a claim is to do nothing more than to simply assert the claim by itself. In claiming that “It is true that lying is wrong,” on this view, a speaker expresses the very same state that is expressed by claiming only that “Lying is wrong,” and nothing more; hence, the “redundancy” of the truth-predicate.

In early twenty-first century discourse, theories like Ramsey’s are referred to as deflationary or minimalist theories of truth, since they effectively “deflate” or “minimize” the ontological significance of the truth-predicate. Some ethical expressivists, in part as a way of solving the Continuity Problem, have taken to supplementing their expressivism with deflationism. The basic idea goes something like this: if we accept a deflationary theory of truth across the board, we can apparently say that ethical claims are truth-apt, in fact, every bit as truth-apt as any other sort of claim. This allows the expressivist to avoid simple versions of the objection noted at the beginning of this section. Interestingly, the deflationism need not stop with the truth-predicate. We might also deflate the notion of a proposition by insisting that a proposition is just whatever is expressed by a truth-apt claim. As long as we allow that ethical claims are truth-apt, in some deflationary sense, we may now be able to say, for instance, that

(3) Lying is wrong

expresses the proposition that lying is wrong, after all. If this is allowed, then the expressivist may now have the resources for accounting for the compositionality of ethical discourse in basically the same way in which traditional propositional semanticists would do so. The meanings of complex ethical claims are to be understood in terms of the propositions expressed by their parts. Once the notion of a proposition is deflated, we might just as well deflate the notion of belief by saying something to the effect that all it is for one to believe that p is for one to accept a claim that expresses the proposition that p. In these ways, perhaps an expressivist can “earn the right” to talk of truth, propositions, and beliefs, and perhaps also knowledge, in the ethical domain, just as they do in non-ethical domains.

This is the essence of Blackburn’s brand of expressivism, known commonly nowadays as ‘quasi-realism’. As we saw earlier, moral realists are those who believe that moral properties have every bit the same ontological status as other, less controversial properties, like roundness and solidity. This allows realists to account for things like truth, propositions, beliefs, and knowledge in the ethical domain in precisely the same way that we ordinarily do in other domains, such as those that include facts about roundness and solidity. By deflating the relevant notions, however, Blackburn and other moral non-realists are nonetheless supposed to be able to say all the things that realists say about moral truth, and the like; hence, “quasi”-realism.

There are at least two problems for ethical expressivists who take this approach to solving the Continuity Problem. The first is simply that deflationism is independently a very controversial view. In his own defense of a deflationary theory of truth, Paul Horwich addresses no fewer than thirty-nine “alleged difficulties” faced by such a theory (Horwich 1998). Granted, he apparently believes that all of these difficulties can be addressed with some degree of satisfaction, but few will deny that deflationary theories of truth represent a departure from the common assumption that truth is a real property of things, and that this property consists in something like a thing’s corresponding with reality. Deflationism may help expressivists avoid the Continuity Problem, but at the cost of then burdening them to defend deflationism against its many problems.

A second and more interesting problem, though, is that taking this deflationary route may, in the end, ruin what was supposed to be so unique about expressivism all along. In other words, there is a sense in which deflationism may be too good a response to the Continuity Problem. After all, at the core of ethical expressivism is the belief that there is some significant difference between ethical and non-ethical discourse. Recall again our two basic instances of each:

(1) It is snowing.

(3) Lying is wrong.

As we just saw, once deflationism is allowed to run its course, we end up saying remarkably similar things about (1) and (3). Both are truth-apt; both express propositions; both can be the objects of belief; both can be known; and so forth. But now you may be wondering: what, then, is supposed to be the significant difference that sets (3) apart from (1)? Or, another way of putting it: what would be the point of contention between an expressivist and her opponents if both parties agreed to deflate such notions as truth, proposition, and belief? This has sometimes been called the problem of “saving the differences” between ethical and non-ethical discourse.

One response to this problem might be to say that the relevant differences between ethical and non-ethical discourse actually occur at a level below the surface of the two linguistic domains. Recall that we deflated the notion of belief by saying that to believe that p is just to accept a claim that expresses the proposition that p. Using these terms, the expressivist might say that the main difference between (1) and (3) is a matter of what is involved in “accepting” the two claims. Accepting an ethical claim like (3) is something importantly different from accepting a non-ethical claim like (1), and presumably the difference has something to do with the types of mental states involved in doing so. Whether or not this sort of response will work is the subject of an ongoing debate in early twenty-first century philosophical literature.

5. Recent Trends

While the Continuity Problem remains a lively point, or collection of points, of debate between expressivists and their critics, it is certainly not the only topic with which those involved in the literature are currently occupied. Here we review a few other recent trends in expressivist thought, perhaps the most notable among them being the advent of so-called “hybrid” expressivist theories.

a. Expressivists’ Attitude Problem

There are some who would say that the Continuity Problem just is the Frege-Geach Problem, that is, that the Frege-Geach Problem ought to be understood very broadly, so as to include all of the many issues associated with the apparent logical and semantic continuities between ethical and non-ethical discourse. Even so, ethical expressivism faces other problems as well. Let us now look briefly at an issue that is receiving more and more attention these days—the so-called Moral Attitude Problem for ethical expressivism.

Recall again that expressivists often claim to have a better way of accounting for the nature of moral disagreement than the account on offer from ethical subjectivists. The idea, according to the expressivist, is supposed to be that a moral disagreement is ultimately just a disagreement in non-cognitive attitudes. Rather than expressing propositions about their opposing attitudes (which, we saw earlier, would be perfectly compatible with each other), the two disagreeing parties directly express those opposing non-cognitive attitudes. But then, in our discussion of the puzzle about negation, we saw that the expressivist may actually owe us more than this. Specifically, she owes us an explanation of what, exactly, those opposing attitudes are supposed to be. If Jones claims that lying is wrong, and Smith claims that it is not wrong, then Jones and Smith are engaged in a moral disagreement about lying. The expressivist, presumably, will say that Jones expresses something like disapproval of lying. But then what is the state that is directly expressed by Smith’s claim, such that it disagrees with, or is incompatible with, Jones’ disapproval?

According to the Moral Attitude Problem, the issue actually runs deeper than this, for there are more constraints on the expressivist’s answer than just that the state expressed by Smith be something incompatible with Jones’ disapproval of lying. In fact, Jones’ disapproval of lying may turn out to be no less mysterious than whatever sort of state is supposed to be expressed by Smith. After all, we disapprove of all sorts of things. Suppose that Jones also disapproves of Quentin Tarantino movies, but Smith does not. Presumably, this would not count as a moral disagreement, despite the fact that Jones and Smith are expressing mental states similar to those expressed in their disagreement about lying. So what is it, according to ethical expressivism, that makes the one disagreement, and not the other, a moral disagreement? This is especially puzzling given that expressivists often clarify their view by saying that moral disagreements are more like aesthetic disagreements, such as a disagreement over Tarantino films, than they are like disagreements over facts, such as whether or not it is snowing.

So the Moral Attitude Problem, basically, is the problem of specifying the exact type, or types, of attitude expressed by ethical claims, such that someone expressing the relevant state counts as making an ethical claim—as opposed to an aesthetic claim, or something else entirely. Judith Thomson raises something like the Moral Attitude Problem when she writes,

The [ethical expressivist] needs to avail himself of a special kind of approval and disapproval: these have to be moral approval and moral disapproval. For presumably he does not wish to say that believing Alice ought to do a thing is having toward her doing it the same attitude of approval that I have toward the sound of her splendid new violin. (Thomson 1996, p.110)

And several years later, in a paper entitled “Some Not-Much-Discussed Problems for Non-Cognitivism in Ethics,” Michael Smith raises the same problem:

[Ethical expressivists] insist that it is analytic that when people sincerely make normative claims they thereby express desires or aversions. But which desires and aversions … , and what special feature do they possess that makes them especially suitable for expression in a normative claim? (Smith 2001, p.107)

But it is only very recently that expressivists and their opponents have begun to give the Moral Attitude Problem the attention that it deserves (see Merli 2008; Kauppinen 2010; Köhler 2013; Miller 2013, pp.39-47, pp.81-87; and Björnsson and McPherson 2014).

What can the expressivist say in response? For starters, expressivists can, and should, point out that the Moral Attitude Problem is not unique to their view. Indeed, those who think that ethical claims express cognitive states, like beliefs—namely, ethical cognitivists—face a very similar challenge: Jones believes both that lying is wrong and that Quentin Tarantino movies are bad, but only one of these counts as a moral belief; what is it, exactly, that distinguishes the moral from the non-moral belief? Cognitivists will say that the one belief has a moral proposition as its content, whereas the other belief does not. But that just pushes the question back a step: what, now, is it that distinguishes the moral from the non-moral proposition? Whether it be a matter of spelling out the difference between moral and non-moral beliefs, or that between moral and non-moral propositions, cognitivists are no less burdened to give an account of the nature of moral thinking than are ethical expressivists.

In fact, Köhler argues that expressivists can actually take what are essentially the same routes in response to the Moral Attitude Problem as those taken by cognitivists. Cognitivists, he thinks, have just two options: they can either (a) characterize the nature of moral thinking by reference to some realm of sui generis moral facts which, when they are the objects of beliefs, make those beliefs moral beliefs, or else (b) do the same, but without positing a realm of sui generis moral facts, and instead identifying moral facts with some set of non-moral facts. Similarly, it seems expressivists have two options: they can either (a) say that “the moral attitude” is some sui generis state of mind, or else (b) insist that “the moral attitude” can be analyzed in terms of non-cognitive mental states with which we are already familiar, like desires and aversions, approval and disapproval, and so forth.

The second of these options for expressivists is certainly the more popular of the two. But according to Köhler, if expressivists are to be successful in taking this approach, they ought to conceive of the identity between “the moral attitude” and other, more familiar non-cognitive states in much the same way that naturalistic moral realists conceive of the identity between moral and non-moral facts—that is, either by insisting that the identity is synthetic a posteriori, as the so-called “Cornell realists” do with moral and non-moral facts, or by insisting that the identity is conceptual, but non-obvious, an approach to conceptual analysis proposed by David Lewis, and recently taken up by a few philosophers from Canberra. Otherwise, if an expressivist is comfortable allowing for a sui generis non-cognitive mental state to hold the place of “the moral attitude,” she should get to work explaining what this state is like. Indeed, Köhler argues that this can be done without violating expressivism’s long-standing commitment to metaphysical naturalism (see Köhler 2013, pp.495-507).

b. Hybrid Theories

Perhaps the most exciting of recent trends in the expressivism literature is the advent of so-called “hybrid” expressivist theories. The idea behind hybrid theories, very basically, is that we might be able to secure all of the advantages of both expressivism and cognitivism by allowing that ethical claims express both non-cognitive and cognitive mental states. Why call them hybrid expressivist views, then, and not hybrid cognitivist views? Recall that the two central theses of ethical expressivism are psychological non-cognitivism (the thesis that ethical claims express mental states that are characteristically non-cognitive) and semantic ideationalism (the thesis that the meanings of ethical claims are to be understood in terms of the mental states that they express). Since neither of these theses states that ethical claims express only non-cognitive states, the hybrid theorist can affirm both of them whole-heartedly. For that reason, hybrid theories are generally considered to be forms of expressivism.

The idea that a single claim might express two distinct mental states is not a new one. Philosophers of language have long thought, for instance, that slurs and pejoratives are capable of doing this. Consider the term “yankee” as used by people living in the American South. In most cases, among Southerners, to call someone a “yankee” is to express a certain sort of negative attitude toward the person. But importantly, the term “yankee” cannot apply to just anyone; rather, it applies only to people who are from the North. Accordingly, when native Southerner Roy says, “Did you hear? Molly’s dating a yankee!” he expresses both (a) a belief that Molly’s partner is from the North, and (b) a negative attitude toward Molly’s partner. It seems we need to suppose that Roy has and expresses both of these states, one cognitive and the other non-cognitive, in order to make adequate sense of the meaning of his claim. In much the same way, hybrid theorists in metaethics suggest that ethical claims can express both beliefs and attitudes. Indeed, these philosophers often model their theories on an analogy to the nature of slurs and pejoratives (see Hay 2013).

Even within the expressivist tradition, the language of hybridity may be new, but the basic idea is not. Recall from earlier that Hare believed ethical claims have two sorts of meaning: descriptive meaning and prescriptive meaning. To claim that something is “good,” he thinks, is to both (a) say or imply that it has some context-specific set of non-moral properties; this is the claim’s descriptive meaning, and (b) commend the thing in virtue of these properties; this is the claim’s prescriptive meaning. This is not far off from a hybrid view according to which “good”-claims express both (a) a belief that something has some property or properties, and (b) a positive non-cognitive attitude toward the thing. Hare was apparently ahead of his time in this respect. The hybrid movement as it is now known is less than a decade old.

One of the earliest notable hybrid views is Ridge’s “ecumenical expressivism” (see Ridge 2006 and 2007). In its initial form, ecumenical expressivism is the view that ethical claims express two closely related mental states—one a belief, and the other a non-cognitive state like approval or disapproval. Furthermore, as an instance of semantic ideationalism, ecumenical expressivism adds that the literal meanings, or semantic contents, of ethical claims are to be understood solely in terms of these mental states. So, for example, the claim

(3) Lying is wrong

expresses something like these two states: (a) disapproval of things that have a certain property F, and (b) a belief that lying has property F. Notably, the view allows for a kind of subjectivity to moral judgment, since the nature of property F will differ from person to person. A utilitarian, for instance, might disapprove of behavior that fails to maximize utility; a Kantian might instead disapprove of behavior that disrespects people’s autonomy; and so on and so forth. Furthermore, Ridge’s view is supposed to be able to solve the Frege-Geach Problem by conceiving of logical inference and validity in terms of the relationships that obtain among beliefs.

(4) If lying is wrong, then getting your little brother to lie is wrong.

According to ecumenical expressivism, complex ethical claims like (4) also express two states: (a) disapproval of things that have a certain property F, and (b) the complex belief that if lying has property F, then getting one’s little brother to lie has property F as well. Coupled with an account of logical validity understood in terms of consistency of beliefs, this looks like a promising way to satisfy Geach’s two challenges. (Ridge has since updated his view so that it is no longer a semantic theory, but rather a meta-semantic theory. Thus, rather than attempting to assign literal meanings to ethical claims, Ridge means only to explain that in virtue of which ethical claims have the meanings that they do. See Ridge 2014.)

The implicature-style views defended by Copp and Finlay also fall within the hybrid camp (Copp 2001, 2009; Finlay 2004, 2005). Coined by philosopher H. Paul Grice, the term “implicature” refers to a semantic phenomenon in which a speaker means or implies one thing, while saying something else. A popular example is that of the professor who writes, “Alex has good handwriting,” in a letter of recommendation. What the professor says is that Alex has good handwriting, but what the professor means or implies is that Alex is not an especially good student. So the claim “Alex has good handwriting” has both a literal content, that Alex has good handwriting, and an implicated content, that Alex is not an especially good student.

In the same way, Copp and Finlay suggest that ethical claims have both literal and implicated contents. Once again:

(3) Lying is wrong

According to these implicature-style views, someone who sincerely utters (3) thereby communicates two things. First, she either expresses a belief, or asserts a proposition, to the effect that lying is wrong—this is the claim’s literal content. Second, she implies that she has some sort of non-cognitive attitude toward lying—this is the claim’s implicated content. It is in this way that implicature-style views are supposed to capture the virtues of both cognitivism and expressivism. Where Copp and Finlay disagree is over the matter of what it is in virtue of which the non-cognitive attitude is implicated. According to Copp, it is a matter of linguistic conventions that govern ethical discourse; whereas Finlay thinks it is a matter of the dynamics of ethical conversation. So Copp’s view is an instance of conventional implicature, while Finlay’s is an instance of conversational implicature.

There may be yet another way to “go hybrid” with one’s expressivism. Rather than hybridizing the mental state(s) expressed by ethical claims, one might instead hybridize the very notion of expression itself. This is the route taken by defenders of a view known as “ethical neo-expressivism” (Bar-On and Chrisman 2009; Bar-On, Chrisman, and Sias 2014). Ethical neo-expressivism rests upon two very important distinctions. The first is a distinction between two different kinds of expression. When we say that agents express their mental states and that sentences express propositions, we refer not just to two different instances of expression, but more importantly, to two different kinds of expression, which are often conflated by expressivists. To see how the two kinds of expression come apart, consider:

(18) It is so great to see you!

(19) I am so glad to see you!

Intuitively, these two sentences have different semantic contents. Setting aside complicated issues related to indexicality, sentence (18) expresses the proposition that it is so great to see you (the addressee), and sentence (19) expresses the proposition that I (the speaker) am so glad to see you (the addressee). However, these two different sentences might nonetheless function as vehicles for expressing the same mental state, that is, I might express my gladness or joy at seeing a friend by uttering either of them. Indeed, I might also do so by hugging my friend, or even just by smiling. Importantly, the neo-expressivist urges, it is not the speaker who expresses this or that proposition, but the sentences. People cannot express propositions, but sentences can, in virtue of being conventional representations of them. However, it is not the sentences that express gladness or joy, but the speaker. Sentences cannot express mental states; they are just strings of words. But people can certainly express their mental states by performing various acts, some of which involve the utterance of sentences. Call the relation between sentences and propositions semantic-expression, or s-expression; and call the relation between agents and their mental states action-expression, or a-expression.

According to neo-expressivists, most ethical expressivists, including most hybrid theorists, conflate these two senses of expression because they fail to adequately recognize a second distinction. Notice that terms like “claim”, “judgment”, and “statement” are ambiguous: they might refer either to an act or to the product of that act. So the term “ethical claim” might refer either to the act of making an ethical claim, or to the product of this act—which, presumably, is a sentence tokened either in thought or in speech. This distinction between ethical claims understood as acts and ethical claims understood as products maps nicely onto the earlier distinction between a- and s-expression. Understood as acts, ethical claims are different from non-ethical claims in that, when making an ethical claim, a speaker a-expresses some non-cognitive attitude. In this way, neo-expressivists can apparently affirm psychological non-cognitivism, and may also have the resources for “saving the differences” between ethical and non-ethical discourse. On the other hand, understood as products—that is, sentences containing ethical terms—ethical claims are just like non-ethical claims in s-expressing propositions, and not necessarily in the deflationary sense of proposition noted above. By allowing that ethical claims express propositions, the neo-expressivist may have all she needs in order to avoid the Continuity Problem.

Now, according to some, semantic ideationalism is essential to expressivism. Gibbard, for instance, writes, “The term ‘expressivism’ I mean to cover any account of meanings that follow this indirect path: to explain the meaning of a term, explain what states of mind the term can be used to express” (2003, p.7). However, ethical neo-expressivism, as we have just seen, rejects semantic ideationalism in favor of the more traditional propositional approach to meaning. In light of this, one might legitimately wonder whether neo-expressivism ought to count as an expressivist view. But as Bar-On, Chrisman, and Sias (2014) argue, neo-expressivism is perfectly capable of accommodating both of the main motivations of non-cognitivism and expressivism described in Sections 1 and 3—that is, avoiding a commitment to “spooky,” irreducibly normative properties, and accounting for the close connection between sincere ethical claims and motivation. Besides, as we saw earlier, it looks like the expressivist’s commitment to semantic ideationalism is what got her into trouble with the Continuity Problem in the first place. So even if neo-expressivism represents something of a departure from mainstream expressivist thought, it may nonetheless be a departure worth considering.

c. Recent Work in Empirical Moral Psychology

Expressivists have long recognized that it is possible to make an ethical claim without being in whatever is supposed to be the corresponding non-cognitive mental state. It is possible, for instance, to utter

(3) Lying is wrong

without, at the same time, disapproving of lying. Maybe the speaker is just reciting a line from a play; or maybe the speaker suffers from a psychological disorder that renders him incapable of ever being in the relevant non-cognitive state, and he is just repeating something that he has heard others say. These are surely possibilities, and expressivists have at times had different things to say about them, and other cases like them. Either way, though, expressivists generally assume that ethical claims are nonetheless tied to non-cognitive states in a way that justifies us in thinking that a speaker of an ethical claim, if she is being sincere, ought to be motivated to act accordingly. This is one of the two main motivations that attract people to theories in the expressivist tradition.

The assumption that sincere ethical claims in ordinary cases are accompanied by non-cognitive states is presumably one that has empirical implications. If true, for instance, one might expect to find activity in regions of the brain associated with such states as people make ethical claims sincerely. Indeed, this is precisely what researchers in empirical moral psychology have found throughout various studies conducted over the past few decades. From brain scans to behavioral experiments, tests of skin conductance to moral judgment surveys given in disgusting environments, study after study seems to confirm a view that is sometimes called “psychological sentimentalism”—that is, the view that people are prompted to make the ethical claims that they make primarily by their emotional responses to things.

Now, to be sure, the link posited by psychological sentimentalism is a causal one—our emotions cause us to make certain ethical claims—and that is importantly different from the conceptual link that expressivists generally assume exists between non-cognitive states and ethical claims. But expressivists may nonetheless benefit from exploring how recent work in empirical moral psychology can be used to support parts of their view—for example, how it is that the conceptual link is supposed to have come about. If nothing else, expressivists may find significant empirical support for the view, shared by everyone in the tradition since Ayer, that ethical claims are expressions of characteristically non-cognitive states of mind.

6. References and Further Reading

Austin, J. L. (1970). “Other Minds.” In J. O. Urmson and G. J. Warnock (eds.), Philosophical Papers. Second Edition. Oxford: Clarendon Press.
Ayer, A. J. (1946/1952). Language, Truth, and Logic. New York: Dover.
Barker, S. (2006). “Truth and the Expressing in Expressivism.” In Horgan, T. and Timmons, M. (eds.). Metaethics after Moore. Oxford: Clarendon Press.
Bar-On, D. and M. Chrisman (2009). “Ethical Neo-Expressivism.” In R. Shafer-Landau (ed.). Oxford Studies in Metaethics, Vol. 4. Oxford: Oxford University Press.
Bar-On, D., M. Chrisman, and J. Sias (2014). “(How) Is Ethical Neo-Expressivism a Hybrid View?” In M. Ridge and G. Fletcher (eds.), Having It Both Ways: Hybrid Theories and Modern Metaethics. Oxford: Oxford University Press.
Björnsson, G. and T. McPherson (2014). “Moral Attitudes for Non-Cognitivists: Solving the Specification Problem.” Mind.
Blackburn, S. (1984). Spreading the Word. Oxford: Clarendon Press.
Blackburn, S. (1988a). “Attitudes and Contents.” Ethics 98: 501-17.
Blackburn, S. (1988b). “Supervenience Revisited.” In G. Sayre-McCord (ed.), Essays on Moral Realism. Ithaca: Cornell University Press.
Blackburn, S. (1998). Ruling Passions. Oxford: Clarendon Press.
Boisvert, D. (2008). “Expressive-Assertivism.” Pacific Philosophical Quarterly 89(2): 169-203.
Boyd, R. (1988). “How to Be a Moral Realist.” In G. Sayre-McCord (ed.), Essays on Moral Realism. Ithaca: Cornell University Press.
Brink, D. (1989). Moral Realism and the Foundations of Ethics. Cambridge: Cambridge University Press.
Chrisman, M. (2008). “Expressivism, Inferentialism, and Saving the Debate.” Philosophy and Phenomenological Research 77: 334-358.
Chrisman, M. (2012). “Epistemic Expressivism.” Philosophy Compass 7(2): 118-126.
Copp, D. (2001). “Realist-Expressivism: A Neglected Option for Moral Realism.” Social Philosophy and Policy 18(2): 1-43.
Copp, D. (2009). “Realist-Expressivism and Conventional Implicature.” In Shafer-Landau, R. (ed.). Oxford Studies in Metaethics, Vol. 4. Oxford: Oxford University Press.
Coventry, A. (2006). Hume’s Theory of Causation: A Quasi-Realist Interpretation. London: Continuum.
Divers, J. and A. Miller (1994). “Why Expressivists About Value Should Not Love Minimalism About Truth.” Analysis 54: 12-19.
Dreier, J. (2004). “Meta-Ethics and the Problem of Creeping Minimalism.” Philosophical Perspectives 18: 23-44.
Dreier, J. (2009). “Relativism (and Expressivism) and the Problem of Disagreement.” Philosophical Perspectives 23: 79-110.
Finlay, S. (2004). “The Conversational Practicality of Value Judgment.” The Journal of Ethics 8: 205-223.
Finlay, S. (2005). “Value and Implicature.” Philosophers’ Imprint 5: 1-20.
Geach, P. (1965). “Assertion.” Philosophical Review 74: 449-465.
Gert, J. (2002). “Expressivism and Language Learning.” Ethics 112: 292-314.
Gibbard, A. (1990). Wise Choices, Apt Feelings. Cambridge, MA: Harvard University Press.
Gibbard, A. (2003). Thinking How to Live. Cambridge, MA: Harvard University Press.
Greene, J. D. (2008). “The Secret Joke of Kant’s Soul.” In W. Sinnott-Armstrong (ed.), Moral Psychology, Vol. 3: The Neuroscience of Morality: Emotion, Brain Disorders, and Development. Cambridge, MA: MIT Press, pp. 35-79.
Greene, J. D. and J. Haidt (2002). “How (and Where) Does Moral Judgment Work?” Trends in Cognitive Sciences 6: 517-523.
Haidt, J. (2001). “The Emotional Dog and Its Rational Tail: A Social Intuitionist Approach to Moral Judgment.” Psychological Review 108(4): 814-834.
Hare, R. M. (1952). The Language of Morals. Oxford: Oxford University Press.
Hare, R. M. (1970). “Meaning and Speech Acts.” The Philosophical Review 79: 3-24.
Hay, R. (2013). “Hybrid Expressivism and the Analogy between Pejoratives and Moral Language.” European Journal of Philosophy 21(3): 450-474.
Horwich, P. (1998). Truth. Second Edition. Oxford: Blackwell.
Jackson, F. (1998). From Metaphysics to Ethics. Oxford: Clarendon Press.
Jackson, F. and P. Pettit (1995). “Moral Functionalism and Moral Motivation.” Philosophical Quarterly 45: 20-40.
Kauppinen, A. (2010). “What Makes a Sentiment Moral?” In R. Shafer-Landau (ed.), Oxford Studies in Metaethics, Vol. 5. Oxford: Oxford University Press.
Köhler, S. (2013). “Do Expressivists Have an Attitude Problem?” Ethics 123(3): 479-507.
Lewis, D. (1970). “How to Define Theoretical Terms.” Journal of Philosophy 67: 427-446.
Lewis, D. (1972). “Psychophysical and Theoretical Identifications.” Australasian Journal of Philosophy 50: 249-258.
Mackie, J. L. (1977). Ethics: Inventing Right and Wrong. London: Penguin.
Merli, D. (2008). “Expressivism and the Limits of Moral Disagreement.” Journal of Ethics 12: 25-55.
Miller, A. (2013). Contemporary Metaethics: An Introduction. Second Edition. Cambridge: Polity.
Moore, G. E. (1903). Principia Ethica. New York: Cambridge University Press.
Nichols, S. (2004). Sentimental Rules: On the Natural Foundations of Moral Judgment. New York: Oxford University Press.
Ogden, C. K. and I. A. Richards (1923). The Meaning of Meaning. New York: Harcourt Brace & Jovanovich.
Price, H. (2011). “Expressivism for Two Voices.” In J. Knowles and H. Rydenfelt (eds.). Pragmatism, Science, and Naturalism. Peter Lang.
Prinz, J. (2006). “The Emotional Basis of Moral Judgments.” Philosophical Explorations 9(1): 29-43.
Ramsey, F. P. (1927). “Facts and Propositions.” Proceedings of the Aristotelian Society 7 (Supplementary): 153-170.
Ridge, M. (2006). “Ecumenical Expressivism: Finessing Frege.” Ethics 116: 302-336.
Ridge, M. (2007). “Ecumenical Expressivism: The Best of Both Worlds?” In Shafer-Landau, R. (ed.). Oxford Studies in Metaethics, Vol. 2. Oxford: Oxford University Press.
Ridge, M. (2014). Impassioned Belief. Oxford: Oxford University Press.
Ridge, M. and G. Fletcher, eds. (2014). Having It Both Ways: Hybrid Theories and Modern Metaethics. Oxford: Oxford University Press.
Schroeder, M. (2008a). Being For: Evaluating the Semantic Program of Expressivism. Oxford: Oxford University Press.
Schroeder, M. (2008b). “Expression for Expressivists.” Philosophy and Phenomenological Research 76(1): 86-116.
Schroeder, M. (2008c). “How Expressivists Can and Should Solve Their Problem with Negation.” Noûs 42(4): 573-599.
Schroeder, M. (2008d). “What is the Frege-Geach Problem?” Philosophy Compass 3(4): 703-720.
Schroeder, M. (2009). “Hybrid Expressivism: Virtues and Vices.” Ethics 119(2): 257-309.
Searle, J. (1962). “Meaning and Speech Acts.” Philosophical Review 71: 423-432.
Searle, J. (1969). Speech Acts: An Essay in the Philosophy of Language. Cambridge: Cambridge University Press.
Smith, M. (1994a). The Moral Problem. Oxford: Blackwell.
Smith, M. (1994b). “Why Expressivists About Value Should Love Minimalism About Truth.” Analysis 54: 1-12.
Smith, M. (2001). “Some Not-Much-Discussed Problems for Non-Cognitivism in Ethics.” Ratio 14: 93-115.
Stevenson, C. L. (1937). “The Emotive Meaning of Ethical Terms.” Mind 46: 14-31.
Stevenson, C. L. (1944). Ethics and Language. New Haven: Yale University Press.
Strawson, P. F. (1949). “Truth.” Analysis 9: 83-97.
Thomson, J. (1996). “Moral Objectivity.” In G. Harman and J. Thomson, Moral Relativism and Moral Objectivity (Great Debates in Philosophy). Oxford: Blackwell.
Unwin, N. (1999). “Quasi-Realism, Negation and the Frege-Geach Problem.” Philosophical Quarterly 49: 337-352.
Unwin, N. (2001). “Norms and Negation: A Problem for Gibbard’s Logic.” Philosophical Quarterly 51: 60-75.
Yalcin, S. (2012). “Bayesian Expressivism.” Proceedings of the Aristotelian Society CXII(2): 123-160.

Author Information

James Sias
Email: siasj@dickinson.edu
Dickinson College
U. S. A.

Gottfried Leibniz: Philosophy of Mind

Gottfried Wilhelm Leibniz (1646-1716) was a true polymath: he made substantial contributions to a host of different fields such as mathematics, law, physics, theology, and most subfields of philosophy. Within the philosophy of mind, his chief innovations include his rejection of the Cartesian doctrines that all mental states are conscious and that non-human animals lack souls as well as sensation. Leibniz’s belief that non-rational animals have souls and feelings prompted him to reflect much more thoroughly than many of his predecessors on the mental capacities that distinguish human beings from lower animals. Relatedly, the acknowledgment of unconscious mental representations and motivations enabled Leibniz to provide a far more sophisticated account of human psychology. It also led Leibniz to hold that perception—rather than consciousness, as Cartesians assume—is the distinguishing mark of mentality.

The capacities that make human minds superior to animal souls, according to Leibniz, include not only their capacity for more elevated types of perceptions or mental representations, but also their capacity for more elevated types of appetitions or mental tendencies. Self-consciousness and abstract thought are examples of perceptions that are exclusive to rational souls, while reasoning and the tendency to do what one judges to be best overall are examples of appetitions of which only rational souls are capable. The mental capacity for acting freely is another feature that sets human beings apart from animals and it in fact presupposes the capacity for elevated kinds of perceptions as well as appetitions.

Another crucial contribution to the philosophy of mind is Leibniz’s frequently cited mill argument. This argument is supposed to show, through a thought experiment that involves walking into a mill, that material things such as machines or brains cannot possibly have mental states. Only immaterial things, that is, soul-like entities, are able to think or perceive. If this argument succeeds, it shows not only that our minds must be immaterial or that we must have souls, but also that we will never be able to construct a computer that can truly think or perceive.

Finally, Leibniz’s doctrine of pre-established harmony also marks an important innovation in the history of the philosophy of mind. Like occasionalists, Leibniz denies any genuine interaction between body and soul. He agrees with them that the fact that my foot moves when I decide to move it, as well as the fact that I feel pain when my body gets injured, cannot be explained by a genuine causal influence of my soul on my body, or of my body on my soul. Yet, unlike occasionalists, Leibniz also rejects the idea that God continually intervenes in order to produce the correspondence between my soul and my body. That, Leibniz thinks, would be unworthy of God. Instead, God has created my soul and my body in such a way that they naturally correspond to each other, without any interaction or divine intervention. My foot moves when I decide to move it because this motion has been programmed into it from the very beginning. Likewise, I feel pain when my body is injured because this pain was programmed into my soul. The harmony or correspondence between mental states and states of the body is therefore pre-established.

1. Leibnizian Minds and Mental States
  a. Perceptions
    i. Consciousness, Apperception, and Reflection
    ii. Abstract Thought, Concepts, and Universal Truths
  b. Appetitions
2. Freedom
3. The Mill Argument
4. The Relation between Mind and Body
5. References and Further Reading
  a. Primary Sources in English Translation
  b. Secondary Sources

1. Leibnizian Minds and Mental States

Leibniz is a panpsychist: he believes that everything, including plants and inanimate objects, has a mind or something analogous to a mind. More specifically, he holds that in all things there are simple, immaterial, mind-like substances that perceive the world around them. Leibniz calls these mind-like substances ‘monads.’ While all monads have perceptions, however, only some of them are aware of what they perceive, that is, only some of them possess sensation or consciousness. Even fewer monads are capable of self-consciousness and rational perceptions. Leibniz typically refers to monads that are capable of sensation or consciousness as ‘souls,’ and to those that are also capable of self-consciousness and rational perceptions as ‘minds.’ The monads in plants, for instance, lack all sensation and consciousness and are hence neither souls nor minds; Leibniz sometimes calls this least perfect type of monad a ‘bare monad’ and compares the mental states of such monads to our states when we are in a stupor or a dreamless sleep. Animals, on the other hand, can sense and be conscious, and thus possess souls (see Animal Minds). God and the souls of human beings and angels, finally, are examples of minds because they are self-conscious and rational. As a result, even though there are mind-like things everywhere for Leibniz, minds in the stricter sense are not ubiquitous.

All monads, even those that lack consciousness altogether, have two basic types of mental states: perceptions, that is, representations of the world around them, and appetitions, or tendencies to transition from one representation to another. Hence, even though monads are similar to the minds or souls described by Descartes in some ways—after all, they are immaterial substances—consciousness is not an essential property of monads, while it is an essential property of Cartesian souls. For Leibniz, then, the distinguishing mark of mentality is perception, rather than consciousness (see Simmons 2001). In fact, even Leibnizian minds in the stricter sense, that is, monads capable of self-consciousness and reasoning, are quite different from the minds in Descartes’s system. While Cartesian minds are conscious of all their mental states, Leibnizian minds are conscious only of a small portion of their states. To us it may seem obvious that there is a host of unconscious states in our minds, but in the seventeenth century this was a radical and novel notion. This profound departure from Cartesian psychology allows Leibniz to paint a much more nuanced picture of the human mind.

One crucial aspect of Leibniz’s panpsychism is that in addition to the rational monad that is the soul of a human being, there are non-rational, bare monads everywhere in the human being’s body. Leibniz sometimes refers to the soul of a human being or animal as the central or dominant monad of the organism. The bare monads that are in an animal’s body, accordingly, are subordinate to its dominant monad or soul. Even plants, for Leibniz, have central or dominant monads, but because they lack sensation, these dominant monads cannot strictly speaking be called souls. They are merely bare monads, like the monads that are subordinate to them.

The claim that there are mind-like things everywhere in nature (in our bodies, in plants, and even in inanimate objects) strikes many readers of Leibniz as ludicrous. Yet, Leibniz thinks he has conclusive metaphysical arguments for this claim. Very roughly, he holds that a complex, divisible thing such as a body can only be real if it is made up of parts that are real. If the parts in turn have parts, those have to be real as well. The problem is, Leibniz claims, that matter is infinitely divisible: we can never reach parts that do not themselves have parts. Even if there were material atoms that we could not actually divide, they would still have to be spatially extended, like all matter, and would therefore have spatial parts. If something is spatially extended, after all, we can at least in thought distinguish its left half from its right half, no matter how small it is. As a result, Leibniz thinks, purely material things are not real. The reality of complex wholes depends on the reality of their parts, but with purely material things, we never get to parts that are real since we never reach an end in this quest for reality. Leibniz concludes that there must be something in nature that is not material and not divisible, and from which all things derive their reality. These immaterial, indivisible things just are monads. Because of the role they play, Leibniz sometimes describes them as “atoms of substance, that is, real unities absolutely destitute of parts, […] the first absolute principles of the composition of things, and, as it were, the final elements in the analysis of substantial things” (p. 142). For a more thorough description of monads, see Leibniz: Metaphysics, as well as the Monadology and the New System of Nature, both included in Ariew and Garber.

a. Perceptions

As already seen, all monads have perceptions, that is, they represent the world around them. Yet, not all perceptions—not even all the perceptions of minds—are conscious. In fact, Leibniz holds that at any given time a mind has infinitely many perceptions, but is conscious only of a very small number of them. Even souls and bare monads have an infinity of perceptions. This is because Leibniz believes, for reasons that need not concern us here (but see Leibniz: Metaphysics), that each monad constantly perceives the entire universe. For instance, even though I am not aware of it at all, my mind is currently representing every single grain of sand on Mars. Even the monads in my little toe, as well as the monads in the apple I am about to eat, represent those grains of sand.

Leibniz often describes perceptions of things of which the subject is unaware and which are far removed from the subject’s body as ‘confused.’ He is fond of using the sound of the ocean as a metaphor for this kind of confusion: when I go to the beach, I do not hear the sound of each individual wave distinctly; instead, I hear a roaring sound from which I am unable to discern the sounds of the individual waves (see Principles of Nature and Grace, section 13, in Ariew and Garber, 1989). None of these individual sounds stands out. Leibniz claims that confused perceptions in monads are analogous to this confusion of sounds, except of course for the fact that monads do not have to be aware even of the confused whole. To the extent that a perception does stand out from the rest, however, Leibniz calls it ‘distinct.’ This distinctness comes in degrees, and Leibniz claims that the central monads of organisms always perceive their own bodies more distinctly than they perceive other bodies.

Bare monads are not capable of very distinct perceptions; their perceptual states are always muddled and confused to a high degree. Animal souls, on the other hand, can have much more distinct perceptions than bare monads. This is in part because they possess sense organs, such as eyes, which allow them to bundle and condense information about their surroundings (see Principles of Nature and Grace, section 4). The resulting perceptions are so distinct that the animals can remember them later, and Leibniz calls this kind of perception ‘sensation.’ The ability to remember prior perceptions is extremely useful, as a matter of fact, because it enables animals to learn from experience. For instance, a dog that remembers being beaten with a stick can learn to avoid sticks in the future (see Principles of Nature and Grace, section 5, in Ariew and Garber, 1989). Sensations are also tied to pleasure and pain: when an animal distinctly perceives some imperfection in its body, such as a bruise, this perception just is a feeling of pain. Similarly, when an animal perceives some perfection of its body, such as nourishment, this perception is pleasure. Unlike Descartes, then, Leibniz believed that animals are capable of feeling pleasure and pain.

Consequently, souls differ from bare monads in part through the distinctness of their perceptions: unlike bare monads, souls can have perceptions that are distinct enough to give rise to memory and sensation, and they can feel pleasure and pain. Rational souls, or minds, share these capacities. Yet they are additionally capable of perceptions of an even higher level. Unlike the souls of lower animals, they can reflect on their own mental states, think abstractly, and acquire knowledge of necessary truths. For instance, they are capable of understanding mathematical concepts and proofs. Moreover, they can think of themselves as substances and subjects: they have the ability to use and understand the word ‘I’ (see Monadology, section 30). These kinds of perceptions, for Leibniz, are distinctively rational perceptions, and they are exclusive to minds or rational souls.

It is clear, then, that there are different types of perceptions: some are unconscious, some are conscious, and some constitute reflection or abstract thought. What exactly distinguishes these types of perceptions, however, is a complicated question that warrants a more detailed investigation.

i. Consciousness, Apperception, and Reflection

Why are some perceptions conscious, while others are not? In one text, Leibniz explains the difference as follows: “it is good to distinguish between perception, which is the internal state of the monad representing external things, and apperception, which is consciousness, or the reflective knowledge of this internal state, something not given to all souls, nor at all times to a given soul” (Principles of Nature and Grace, section 4). This passage is interesting for several reasons. Leibniz not only equates consciousness with what he calls ‘apperception’ and states that only some monads possess it; he also seems to claim that conscious perceptions differ from other perceptions in virtue of having different types of things as their objects: while unconscious perceptions represent external things, apperception or consciousness has perceptions, that is, internal things, as its object. Consciousness is therefore closely connected to reflection, as the term ‘reflective knowledge’ also makes clear.

The passage furthermore suggests that Leibniz understands consciousness in terms of higher-order mental states because it says that in order to be conscious of a perception, I must possess “reflective knowledge” of that perception. One way of interpreting this statement is to understand these higher-order mental states as higher-order perceptions: in order to be conscious of a first-order perception, I must additionally possess a second-order perception of that first-order perception. For example, in order to be conscious of the glass of water in front of me, I must not only perceive the glass of water, but I must also perceive my perception of the glass of water. After all, in the passage under discussion, Leibniz defines ‘consciousness’ or ‘apperception’ as the reflective knowledge of a perception. Such higher-order theories of consciousness are still endorsed by some philosophers of mind today (see Consciousness). For an alternative interpretation of Leibniz’s theory of consciousness, however, see Jorgensen 2009, 2011a, and 2011b.

There is excellent textual evidence that according to Leibniz, consciousness or apperception is not limited to minds, but is instead shared by animal souls. One passage in which Leibniz explicitly ascribes apperception to animals is from the New Essays: “beasts have no understanding … although they have the faculty for apperceiving the more conspicuous and outstanding impressions—as when a wild boar apperceives someone who is shouting at it” (p. 173). Moreover, Leibniz sometimes claims that sensation involves apperception (e.g. New Essays p. 161; p. 188), and since animals are clearly capable of sensation, they must thus possess some form of apperception. Hence, it seems that Leibniz ascribes apperception to animals, which in turn he elsewhere identifies with consciousness.

Yet, the textual evidence for animal consciousness is unfortunately anything but neat because in the New Essays—that is, in the very same text—Leibniz also suggests that there is an important difference between animals and human beings somewhere in this neighborhood. In several passages, he says that any creature with consciousness has a moral or personal identity, which in turn is something he grants only to minds. He states, for instance, that “consciousness or the sense of I proves moral or personal identity” (New Essays, p. 236). Hence, it seems clear that for Leibniz there is something in the vicinity of consciousness that animals lack and that minds possess, and which is crucial for morality.

A promising solution to this interpretive puzzle is the following: what animals lack is not consciousness generally, but only a particular type of consciousness. More specifically, while they are capable of consciously perceiving external things, they lack awareness, or at least a particular type of awareness, of the self. In the Monadology, for instance, Leibniz argues that knowledge of necessary truths distinguishes us from animals and that through this knowledge “we rise to reflexive acts, which enable us to think of that which is called ‘I’ and enable us to consider that this or that is in us” (sections 29-30). Similarly, he writes in the Principles of Nature and Grace that “minds … are capable of performing reflective acts, and capable of considering what is called ‘I’, substance, soul, mind—in brief, immaterial things and immaterial truths” (section 5). Self-knowledge, or self-consciousness, then, appears to be exclusive to rational souls. Leibniz moreover connects this consciousness of the self to personhood and moral responsibility in several texts, for instance in the Theodicy: “In saying that the soul of man is immortal one implies the subsistence of what makes the identity of the person, something which retains its moral qualities, conserving the consciousness, or the reflective inward feeling, of what it is: thus it is rendered susceptible to chastisement or reward” (section 89).

Based on these passages, it seems that one crucial cognitive difference between human beings and animals is that even though animals possess the kind of apperception that is involved in sensation and in an acute awareness of external objects, they lack a certain type of apperception or consciousness, namely reflective self-knowledge or self-consciousness. Especially because of the moral implications of this kind of consciousness that Leibniz posits, this difference is clearly an extremely important one. According to these texts, then, it is not consciousness or apperception tout court that distinguishes minds from animal souls, but rather a particular kind of apperception. What animals are incapable of, according to Leibniz, is self-knowledge or self-awareness, that is, an awareness not only of their perceptions, but also of the self that is having those perceptions.

Because Leibniz associates consciousness so closely with reflection, one might wonder whether the fact that animals are capable of conscious perceptions implies that they are also capable of reflection. This is another difficult interpretive question because there appears to be evidence both for a positive and for a negative answer. Reflection, according to Leibniz, is “nothing but attention to what is within us” (New Essays, p. 51). Moreover, as already seen, he argues that reflective acts enable us “to think of that which is called ‘I’ and … to consider that this or that is in us” (Monadology, section 30). Leibniz does not appear to ascribe reflection to animals explicitly, and in fact, there are several texts in which he says in no uncertain terms that they lack reflection altogether. He states for instance that “the soul of a beast has no more reflection than an atom” (Loemker, p. 588). Likewise, he defines ‘intellection’ as “a distinct perception combined with a faculty of reflection, which the beasts do not have” (New Essays, p. 173) and explains that “just as there are two sorts of perception, one simple, the other accompanied by reflections that give rise to knowledge and reasoning, so there are two kinds of souls, namely ordinary souls, whose perception is without reflection, and rational souls, which think about what they do” (Strickland, p. 84).

On the other hand, as seen, Leibniz does ascribe apperception or consciousness to animals, and consciousness in turn appears to involve higher-order mental states. This suggests that Leibnizian animals must perceive or know their own perceptions when they are conscious of something, and that in turn seems to imply that they can reflect after all. A closely related reason for ascribing reflection to animals is that Leibniz sometimes explicitly associates reflection with apperception or consciousness. In a passage already quoted above, for instance, Leibniz defines ‘consciousness’ as the reflective knowledge of a first-order perception. Hence, if animals possess consciousness it seems that they must also have some type of reflection.

We are consequently faced with an interpretive puzzle: even though there is strong indirect evidence that Leibniz attributes reflection to animals, there is also direct evidence against it. There are at least two ways of solving this puzzle. In order to make sense of passages in which Leibniz restricts reflection to rational souls, one can either deny that perceiving one’s internal states is sufficient for reflection, or one can distinguish between different types of reflection, in such a way that the most demanding type of reflection is limited to minds. One good way to deny that perception of one’s internal states is sufficient for reflection is to point out that Leibniz defines reflection as “attention to what is within us” (New Essays, p. 51), rather than as ‘perception of what is within us.’ Attention to internal states, arguably, is more demanding than mere perception of these states, and animals may well be incapable of the former. Attention might be a particularly distinct perception, for instance. Alternatively, one can argue that reflection requires a self-concept, or self-knowledge, which also goes beyond the mere perception of internal states and may be inaccessible to animals. Perceiving my internal states, on that interpretation, amounts to reflection only if I also possess knowledge of the self that is having those states. Instead of denying that perceiving one’s own states is sufficient for reflection, one can also distinguish different types of reflection and claim that while the mere perception of one’s internal states is a type of reflection, there is a more demanding type of reflection that requires attention, a self-concept, or something similar. Yet, the difference between those two responses appears to be merely terminological. Based on the textual evidence discussed above, it is clear that either reflection generally, or at least a particular type of reflection, must be exclusive to minds.

ii. Abstract Thought, Concepts, and Universal Truths

So far, we have seen that one cognitive capacity that elevates minds above animal souls is self-consciousness, which is a particular type of reflection. Before turning to appetitions, we should briefly investigate three additional, mutually related, cognitive abilities that only minds possess, namely the abilities to abstract, to form or possess concepts, and to know general truths. In what may well be Leibniz’s most intriguing discussion of abstraction, he says that some non-human animals “apparently recognize whiteness, and observe it in chalk as in snow; but it does not amount to abstraction, which requires attention to the general apart from the particular, and consequently involves knowledge of universal truths which beasts do not possess” (New Essays, p. 142). In this passage, we learn not only that beasts are incapable of abstraction, but also that abstraction involves “attention to the general apart from the particular” as well as “knowledge of universal truths.” Hence, abstraction for Leibniz seems to consist in separating out one part of a complex idea and focusing on it exclusively. Instead of thinking of different white things, one must think of whiteness in general, abstracting away from the particular instances of whiteness. In order to think about whiteness in the abstract, then, it is not enough to perceive different white things as similar to one another.

Yet, it might still seem mysterious how precisely animals should be able to observe whiteness in different objects if they are unable to abstract. One fact that makes this less mysterious, however, is that, on Leibniz’s view, while animals are unable to pay attention to whiteness in general, the idea of whiteness may nevertheless play a role in their recognition of whiteness. As Leibniz explains in the New Essays, even though human minds become aware of complex ideas and particular truths first, and rather easily, and must expend a lot of effort to subsequently achieve awareness of simple ideas and general principles, the order of nature is the other way around:

The truths that we start by being aware of are indeed particular ones, just as we start with the coarsest and most composite ideas. But that doesn’t alter the fact that in the order of nature the simplest comes first, and that the reasons for particular truths rest wholly on the more general ones of which they are mere instances. … The mind relies on these principles constantly; but it does not find it so easy to sort them out and to command a distinct view of each of them separately, for that requires great attention to what it is doing. (p. 83f.)

Here, Leibniz says that minds can rely on general principles, or abstract ideas, without being aware of them, and without having distinct perceptions of them separately. This might help us to explain how animals can observe whiteness in different white objects without being able to abstract: the simple idea of whiteness might play a role in their cognition, even though they are not aware of it, and are unable to pay attention to this idea.

The passage just quoted is interesting for another reason: It shows that abstracting and achieving knowledge of general truths have a lot in common and presuppose the capacity to reflect. It takes a special effort of mind to become aware of abstract ideas and general truths, that is, to separate these out from complex ideas and particular truths. It is this special effort, it seems, of which animals are incapable; while they can at times achieve relatively distinct perceptions of complex or particular things, they lack the ability to pay attention, or at least sufficient attention, to their internal states. At least part of the reason for their inability to abstract and to know general truths, then, appears to be their inability, or at least very limited ability, to reflect.

Abstraction also seems closely related to the possession or formation of concepts: arguably, what a mind gains when abstracting the idea of whiteness from the complex ideas of particular white things is what we would call a concept of whiteness. Hence, since animals cannot abstract, they do not possess such concepts. They may nevertheless, as suggested above, have confused ideas such as a confused idea of whiteness that allows them to recognize whiteness in different white things, without enabling them to pay attention to whiteness in the abstract.

An interesting question that arises in this context is whether having an idea of the future or thinking about a future state requires abstraction. One reason to think so is that, plausibly, in order to think about the future, for instance about future pleasures or pains, one needs to abstract from the present pleasures or pains that one can directly experience, or from past pleasures and pains that one remembers. After all, just as one can only attain the concept of whiteness by abstracting from other properties of the particular white things one has experienced, so, arguably, one can only acquire the idea of future pleasures through abstraction from particular present pleasures. It may be for this reason that Leibniz sometimes notes that animals have “neither foresight nor anxiety for the future” (Huggard, p. 414). Apparently, he does not consider animals capable of having an idea of the future or of future states.

Leibniz thinks that in addition to sensible concepts such as whiteness, we also have concepts that are not derived from the senses, that is, we possess intellectual concepts. The latter, it seems, set us apart even farther from animals because we attain them through reflective self-awareness, of which animals, as seen above, are not capable. Leibniz says, for instance, that “being is innate in us—the knowledge of being is comprised in the knowledge that we have of ourselves. Something like this holds of other general notions” (New Essays, p. 102). Similarly, he states a few pages later that “reflection enables us to find the idea of substance within ourselves, who are substances” (New Essays, p. 105). Many similar statements can be found elsewhere. The intellectual concepts that we can discover in our souls, according to Leibniz, include not only being and substance, but also unity, similarity, sameness, pleasure, cause, perception, action, duration, doubting, willing, and reasoning, to name only a few. In order to derive these concepts from our reflective self-awareness, we must apparently engage in abstraction: I am distinctly aware of myself as an agent, a substance, and a perceiver, for instance, and from this awareness I can abstract the ideas of action, substance, and perception in general. This means that animals are inferior to us among other things in the following two ways: they cannot have distinct self-awareness, and they cannot abstract. They would need both of these capacities in order to form intellectual concepts, and they would need the latter—that is, abstraction—in order to form sensible concepts.

Intellectual concepts are not the only things that minds can find in themselves: in addition, they are also able to discover eternal or general truths there, such as the axioms or principles of logic, metaphysics, ethics, and natural theology. Like the intellectual concepts just mentioned, these general truths or principles cannot be derived from the senses and can thus be classified as innate ideas. Leibniz says, for instance,

Above all, we find [in this I and in the understanding] the force of the conclusions of reasoning, which are part of what is called the natural light. … It is also by this natural light that the axioms of mathematics are recognized. … [I]t is generally true that we know [necessary truths] only by this natural light, and not at all by the experiences of the senses. (Ariew and Garber, p. 189)

Axioms and general principles, according to this passage, must come from the mind itself and cannot be acquired through sense experience. Yet, also as in the case of intellectual concepts, it is not easy for us to discover such general truths or principles in ourselves; instead, it takes effort or special attention. It again appears to require the kind of attention to what is within us of which animals are not capable. Because they lack this type of reflection, animals are “governed purely by examples from the senses” and “consequently can never arrive at necessary and general truths” (Strickland, p. 84).

b. Appetitions

Monads possess not only perceptions, or representations of the world they inhabit, but also appetitions. These appetitions are the tendencies or inclinations of these monads to act, that is, to transition from one mental state to another. The most familiar examples of appetitions are conscious desires, such as my desire to have a drink of water. Having this desire means that I have some tendency to drink from the glass of water in front of me. If the desire is strong enough, and if there are no contrary tendencies or desires in my mind that are stronger—for instance, the desire to win the bet that I can refrain from drinking water for one hour—I will attempt to drink the water. This desire for water is one example of a Leibnizian appetition. Yet, just as in the case of perceptions, only a very small portion of appetitions is conscious. We are unaware of most of the tendencies that lead to changes in our perceptions. For instance, I am aware neither of perceiving my hair growing, nor of my tendencies to have those perceptions. Moreover, as in the case of perceptions, there are an infinite number of appetitions in any monad at any given time. This is because, as seen, each monad represents the entire universe. As a result, each monad constantly transitions from one infinitely complex perceptual state to another, reflecting all changes that take place in the universe. The tendency that leads to a monad’s transition from one of these infinitely complex perceptual states to another is therefore also infinitely complex, or composed of infinitely many smaller appetitions.

The three types of monads—bare monads, souls, and minds—differ not only with respect to their perceptual or cognitive capacities, but also with respect to their appetitive capacities. In fact, there are good reasons to think that three different types of appetitions correspond to the three types of perceptions mentioned above, that is, to perception, sensation, and rational perception. After all, Leibniz distinguishes between appetitions of which we can be aware and those of which we cannot be aware, which he sometimes also calls ‘insensible appetitions’ or ‘insensible inclinations.’ He appears to further divide the type of which we can be aware into rational and non-rational appetitions. This threefold division is made explicit in a passage from the New Essays:

There are insensible inclinations of which we are not aware. There are sensible ones: we are acquainted with their existence and their objects, but have no sense of how they are constituted. … Finally there are distinct inclinations which reason gives us: we have a sense both of their strength and of their constitution. (p. 194)

According to this passage, then, Leibniz acknowledges the following three types of appetitions: (a) insensible or unconscious appetitions, (b) sensible or conscious appetitions, and (c) distinct or rational appetitions.

Even though Leibniz does not say so explicitly, he furthermore believes that bare monads have only unconscious appetitions, that animal souls additionally have conscious appetitions, and that only minds have distinct or rational appetitions. Unconscious appetitions are tendencies such as the one that leads to my perception of my hair growing, or the one that prompts me unexpectedly to perceive the sound of my alarm in the morning. All appetitions in bare monads are of this type; they are not aware of any of their tendencies. An example of a sensible appetition, on the other hand, is an appetition for pleasure. My desire for a piece of chocolate, for instance, is such an appetition: I am aware that I have this desire and I know what the object of the desire is, but I do not fully understand why I have it. Animals are capable of this kind of appetition; in fact, many of their actions are motivated by their appetitions for pleasure. Finally, an example of a rational appetition is the appetition for something that my intellect has judged to be the best course of action. Leibniz appears to identify the capacity for this kind of appetition with the will, which, as we will see below, plays a crucial role in Leibniz’s theory of freedom. What is distinctive of this kind of appetition is that whenever we possess it, we are not only aware of it and of its object, but also understand why we have it. For instance, if I judge that I ought to call my mother and consequently desire to call her, Leibniz thinks, I am aware of the thought process that led me to make this judgment, and hence of the origins of my desire.

Another type of rational appetition is the type of appetition involved in reasoning. As seen, Leibniz thinks that animals, because they can remember prior perceptions, are able to learn from experience, like the dog that learns to run away from sticks. This sort of behavior, which involves a kind of inductive inference (see Deductive and Inductive Arguments), can be called a “shadow of reasoning,” Leibniz tells us (New Essays, p. 50). Yet, animals are incapable of true—that is, presumably, deductive—reasoning, which, Leibniz tells us, “depends on necessary or eternal truths, such as those of logic, numbers, and geometry, which bring about an indubitable connection of ideas and infallible consequences” (Principles of Nature and Grace, section 5, in Ariew and Garber, 1989). Only minds can reason in this stricter sense.

Some interpreters think that reasoning consists simply in very distinct perception. Yet that cannot be the whole story. First of all, reasoning must involve a special type of perception that differs from the perceptions of lower animals in kind, rather than merely in degree, namely abstract thought and the perception of eternal truths. This kind of perception is not just more distinct; it has entirely different objects than the perceptions of non-rational souls, as we saw above. Moreover, it seems more accurate to describe reasoning as a special kind of appetition or tendency than as a special kind of perception. This is because reasoning is not just one perception, but rather a series of perceptions. Leibniz for instance calls it “a chain of truths” (New Essays, p. 199) and defines it as “the linking together of truths” (Huggard, p. 73). Thus, reasoning is not the same as perceiving a certain type of object, nor as perceiving an object in a particular fashion. Rather, it consists mainly in special types of transitions between perceptions and therefore, according to Leibniz’s account of how monads transition from perception to perception, in appetitions for these transitions. What a mind needs in order to be rational, therefore, are appetitions that one could call the principles of reasoning. These appetitions or principles allow minds to transition, for instance, from the premises of an argument to its conclusion. In order to conclude ‘Socrates is mortal’ from ‘All men are mortal’ and ‘Socrates is a man,’ for example, I not only need to perceive the premises distinctly, but I also need an appetition for transitioning from premises of a particular form to conclusions of a particular form.

Leibniz states in several texts that our reasonings are based on two fundamental principles: the Principle of Contradiction and the Principle of Sufficient Reason. Human beings also have access to several additional innate truths and principles, for instance those of logic, mathematics, ethics, and theology. In virtue of these principles we have a priori knowledge of necessary connections between things, while animals can only have empirical knowledge of contingent, or merely apparent, connections. The perceptions of animals, then, are not governed by the principles on which our reasonings are based; the closest an animal can come to reasoning is, as mentioned, engaging in empirical inference or induction, which is based not on principles of reasoning, but merely on the recognition and memory of regularities in previous experience. This confirms that reasoning is a type of appetition: using, or being able to use, principles of reasoning cannot just be a matter of perceiving the world more distinctly. In fact, these principles are not something that we acquire or derive from perceptions. Instead, at least the most basic ones are innate dispositions for making certain kinds of transitions.

In connection with reasoning, it is important to note that even though Leibniz sometimes uses the term ‘thought’ for perceptions generally, he makes it clear in some texts that it strictly speaking belongs exclusively to minds because it is “perception joined with reason” (Strickland, p. 66; see also New Essays, p. 210). This means that the ability to think in this sense, just like reasoning, is also something that is exclusive to minds, that is, something that distinguishes minds from animal souls. Non-rational souls neither reason nor think, strictly speaking; they do however have perceptions.

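The distinctive cognitive and appetitive capacities of the three types of monads are summarized in the following table:

Type of monad | Perceptions | Appetitions
Bare monads | Perception only (always highly confused and unconscious) | Insensible or unconscious appetitions only
Souls | Perception and sensation (perceptions distinct enough for memory, consciousness, pleasure, and pain) | Insensible and sensible appetitions
Minds | Perception, sensation, and rational perception (self-consciousness, abstraction, concepts, knowledge of necessary truths) | Insensible, sensible, and rational appetitions (will and reasoning)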

2. Freedom

One final capacity that sets human beings apart from non-rational animals is the capacity for acting freely. This is mainly because Leibniz closely connects free agency with rationality: acting freely requires acting in accordance with one’s rational assessment of which course of action is best. Hence, acting freely involves rational perceptions as well as rational appetitions. It requires both knowledge of, or rational judgments about, the good and the tendency to act in accordance with these judgments. For Leibniz, the capacity for rational judgments is called ‘intellect,’ and the tendency to pursue what the intellect judges to be best is called ‘will.’ Non-human animals, because they do not possess intellects and wills, or the requisite type of perceptions and appetitions, lack freedom. This also means, however, that most human actions are not free, because we only sometimes reason about the best course of action and act voluntarily, on the basis of our rational judgments. Leibniz in fact stresses that in three quarters of their actions, human beings act just like animals, that is, without making use of their rationality (see Principles of Nature and Grace, section 5, in Ariew and Garber, 1989).

In addition to rationality, Leibniz claims, free actions must be self-determined and contingent (see e.g. Theodicy, section 288). An action is self-determined—or spontaneous, as Leibniz often calls it—when its source is in the agent, rather than in another agent or some other external entity. While all actions of monads are spontaneous in a general sense since, as we will see in section four, Leibniz denies all interaction among created substances, he may have a more demanding notion of spontaneity in mind when he calls it a requirement for freedom. After all, when an agent acts on the basis of her rational judgment, she is not even subject to the kind of apparent influence of her body or of other creatures that is present, for instance, when someone pinches her and she feels pain.

In order to be contingent, on the other hand, the action cannot be the result of compulsion or necessitation. This, again, is generally true for all actions of monads because Leibniz holds that all changes in the states of a creature are contingent. Yet, there may again be an especially demanding sense in which free actions are contingent for Leibniz. He often says that when a rational agent does something because she believes it to be best, the goodness she perceives, or her motives for acting, merely incline her towards action without necessitating action (see e.g. Huggard, p. 419; Fifth Letter to Clarke, sections 8-9; Ariew and Garber, p. 195; New Essays, p. 175). Hence, Leibniz may be attributing a particular kind of contingency to free actions.

Even though Leibniz holds that free actions must be contingent, that is, that they cannot be necessary, he grants that they can be determined. In fact, Leibniz vehemently rejects the notion that a world with free agents must contain genuine indeterminacy. Hence, Leibniz is what we today call a compatibilist about freedom and determinism (see Free Will). He believes that all actions, whether they are free or not, are determined by the nature and the prior states of the agent. What is special about free actions, then, is not that they are undetermined, but rather that they are determined, among other things, by rational perceptions of the good. We always do what we are most strongly inclined to do, for Leibniz, and if we are most strongly inclined by our judgment about the best course of action, we pursue that course of action freely. The ability to act contrary even to one’s best reasons or motives, Leibniz contends, is not required for freedom, nor would it be worth having. As Leibniz puts it in the New Essays, “the freedom to will contrary to all the impressions which may come from the understanding … would destroy true liberty, and reason with it, and would bring us down below the beasts” (p. 180). In fact, being determined by our rational understanding of the good, as we are in our free actions, makes us godlike, because according to Leibniz, God is similarly determined by what he judges to be best. Nothing could be more perfect and more desirable than acting in this way.

3. The Mill Argument

In several of his writings, Leibniz argues that purely material things such as brains or machines cannot possibly think or perceive. Hence, Leibniz contends that materialists like Thomas Hobbes are wrong to think that they can explain mentality in terms of the brain. This argument is without question among Leibniz’s most influential contributions to the philosophy of mind. It is relevant not only to the question whether human minds might be purely material, but also to the question whether artificial intelligence is possible. Because Leibniz’s argument against perception in material objects often employs a thought experiment involving a mill, interpreters refer to it as ‘the mill argument.’ There is considerable disagreement among recent scholars about the correct interpretation of this argument (see References and Further Reading). The present section sketches one plausible way of interpreting Leibniz’s mill argument.

The most famous version of Leibniz’s mill argument occurs in section 17 of the Monadology:

Moreover, we must confess that perception, and what depends on it, is inexplicable in terms of mechanical reasons, that is, through shapes and motions. If we imagine that there is a machine whose structure makes it think, sense, and have perceptions, we could conceive it enlarged, keeping the same proportions, so that we could enter into it, as one enters into a mill. Assuming that, when inspecting its interior, we will only find parts that push one another, and we will never find anything to explain a perception. And so, we should seek perception in the simple substance and not in the composite or in the machine.

To understand this argument, it is important to recall that Leibniz, like many of his contemporaries, views all material things as infinitely divisible. As already seen, he holds that there are no smallest or most fundamental material elements, and every material thing, no matter how small, has parts and is hence complex. Even if there were physical atoms—against which Leibniz thinks he has conclusive metaphysical arguments—they would still have to be extended, like all matter, and we would hence be able to distinguish between an atom’s left half and its right half. The only truly simple things that exist are monads, that is, unextended, immaterial, mind-like things. Based on this understanding of material objects, Leibniz argues in the mill passage that only immaterial entities are capable of perception because it is impossible to explain perception mechanically, or in terms of material parts pushing one another.

Unfortunately, Leibniz does not say explicitly why exactly he thinks there cannot be a mechanical explanation of perception. Yet it becomes clear in other passages that for Leibniz perceiving has to take place in a simple thing. This assumption, in turn, straightforwardly implies that matter, which as we have seen is complex, is incapable of perception. This, most likely, is behind Leibniz’s mill argument. Why does Leibniz claim that perception can only take place in simple things? If he did not have good reasons for this claim, after all, it would not constitute a convincing starting point for his mill argument.

Leibniz’s reasoning appears to be the following. Material things, such as mirrors or paintings, can represent complexity. When I stand in front of a mirror, for instance, the mirror represents my body. This is an example of the representation of one complex material thing in another complex material thing. Yet, Leibniz argues, we do not call such a representation ‘perception’: the mirror does not “perceive” my body. The reason this representation falls short of perception, Leibniz contends, is that it lacks the unity that is characteristic of perceptions: the top part of the mirror represents the top part of my body, and so on. The representation of my body in the mirror is merely a collection of smaller representations, without any genuine unity. When another person perceives my body, on the other hand, her representation of my body is a unified whole. No physical thing can do better than the mirror in this respect: the only way material things can represent anything is through the arrangement or properties of their parts. As a result, any such representation will be spread out over multiple parts of the representing material object and hence lack genuine unity. It is arguably for this reason that Leibniz defines ‘perception’ as “the passing state which involves and represents a multitude in the unity or in the simple substance” (Monadology, section 14).

Leibniz’s mill argument, then, relies on a particular understanding of perception and of material objects. Because all material objects are complex and because perceptions require unity, material objects cannot possibly perceive. Any representation a machine, or a material object, could produce would lack the unity required for perception. The mill example is supposed to illustrate this: even an extremely small machine, if it is purely material, works only in virtue of the arrangement of its parts. Hence, it is always possible, at least in principle, to enlarge the machine. When we imagine the machine thus enlarged, that is, when we imagine being able to distinguish the machine’s parts as we can distinguish the parts of a mill, we will realize that the machine cannot possibly have genuine perceptions.

Yet the basic idea behind Leibniz’s mill argument can be appealing even to those of us who do not share Leibniz’s assumptions about perception and material objects. In fact, it appears to be a more general version of what is today called “the hard problem of consciousness,” that is, the problem of explaining how something physical could explain, or give rise to, consciousness. While Leibniz’s mill argument is about perception generally, rather than conscious perception in particular, the underlying structure of the argument appears to be similar: mental states have characteristics—such as their unity or their phenomenal properties—that, it seems, cannot even in principle be explained physically. There is an explanatory gap between the physical and the mental.

4. The Relation between Mind and Body

The mind-body problem is a central issue in the philosophy of mind. It is, roughly, the problem of explaining how mind and body can causally interact. That they interact seems exceedingly obvious: my mental states, such as my desire for a cold drink, do seem capable of producing changes in my body, such as the bodily motions required for walking to the fridge and retrieving a bottle of water. Likewise, certain physical states seem capable of producing changes in my mind: when I stub my toe on my way to the fridge, for instance, this event in my body appears to cause me pain, which is a mental state. For Descartes and his followers, it is notoriously difficult to explain how mind and body causally interact. After all, Cartesians are substance dualists: they believe that mind and body are substances of a radically different type (see Descartes: Mind-Body Distinction). How could a mental state such as a desire cause a physical state such as a bodily motion, or vice versa, if mind and body have absolutely nothing in common? This is the version of the mind-body problem that Cartesians face.

For Leibniz, the mind-body problem does not arise in exactly the way it arises for Descartes and his followers, because Leibniz is not a substance dualist. We have already seen that, according to Leibniz, an animal or human being has a central monad, which constitutes its soul, as well as subordinate monads that are everywhere in its body. In fact, Leibniz appears to hold that the body just is the collection of these subordinate monads and their perceptions (see e.g. Principles of Nature and Grace section 3), or that bodies result from monads (Ariew and Garber, p. 179). After all, as already seen, he holds that purely material, extended things would not only be incapable of perception, but would also not be real because of their infinite divisibility. The only truly real things, for Leibniz, are monads, that is, immaterial and indivisible substances. This means that Leibniz, unlike Descartes, does not believe that there are two fundamentally different kinds of substances, namely physical and mental substances. Instead, for Leibniz, all substances are of the same general type. As a result, the mind-body problem may seem more tractable for Leibniz: if bodies have a semi-mental nature, there are fewer obvious obstacles to claiming that bodies and minds can interact with one another.

Yet, for complicated reasons that are beyond the scope of this article (but see Leibniz: Causation), Leibniz held that human minds and their bodies (as well as any created substances, in fact) cannot causally interact. In this, he agrees with occasionalists such as Nicolas Malebranche. Leibniz departs from occasionalists, however, in his positive account of the relation between mental and corresponding bodily events. Occasionalists hold that God needs to intervene in nature constantly to establish this correspondence. When I decide to move my foot, for instance, God intervenes and moves my foot accordingly, occasioned by my decision. Leibniz, however, thinks that such interventions would constitute perpetual miracles and be unworthy of a God who always acts in the most perfect manner. God arranged things so perfectly, Leibniz contends, that there is no need for these divine interventions. Even though he endorses the traditional theological doctrine that God continually conserves all creatures in existence and concurs with their actions (see Leibniz: Causation), Leibniz stresses that all natural events in the created world are caused and made intelligible by the natures of created things. In other words, Leibniz rejects the occasionalist doctrine that God is the only active, efficient cause, and that the laws of nature that govern natural events are merely God’s intentions to move his creatures around in a particular way. Instead, for Leibniz, these laws, or God’s decrees about the ways in which created things should behave, are written into the natures of these creatures. God not only decided how creatures should act, but also gave them natures and natural powers from which these actions follow. To understand the regularities and events in nature, we do not need to look beyond the natures of creatures. This, Leibniz claims, is much more worthy of a perfect God than the occasionalist world, in which natural events are not internally intelligible.

How, then, does Leibniz explain the correspondence between mental and bodily states if he denies that there is genuine causal interaction among finite things and also denies that God brings about the correspondence by constantly intervening? Consider again the example in which I decide to get a drink from the fridge and my body executes that decision. It may seem that unless there is a fairly direct link between my decision and the action—either a link supplied by God’s intervention, or by the power of my mind to cause bodily motion—it would be an enormous coincidence that my body carries out my decision. Yet, Leibniz thinks there is a third option, which he calls ‘pre-established harmony.’ On this view, God created my body and my mind in such a way that they naturally, but without any direct causal links, correspond to one another. God knew, before he created my body, that I would decide to get a cold drink, and hence made my body in such a way that it will, in virtue of its own nature, walk to the fridge and get a bottle of water right after my mind makes that decision.

In one text, Leibniz provides a helpful analogy for his doctrine of pre-established harmony. Imagine two pendulum clocks that are in perfect agreement for a long period of time. There are three ways to ensure this kind of correspondence between them: (a) establishing a causal link, such as a connection between the pendulums of these clocks, (b) asking a person constantly to synchronize the two clocks, and (c) designing and constructing these clocks so perfectly that they will remain perfectly synchronized without any causal links or adjustments (see Ariew and Garber, pp. 147-148). Option (c), Leibniz contends, is superior to the other two options, and it is in this way that God ensures that the states of my mind correspond to the states of my body, or in fact, that the perceptions of any created substance harmonize with the perceptions of any other. The world is arranged and designed so perfectly that events in one substance correspond to events in another substance even though they do not causally interact, and even though God does not intervene to adjust one to the other. Because of his infinite wisdom and foreknowledge, God was able to pre-establish this mutual correspondence or harmony when he created the world, analogously to the way a skilled clockmaker can construct two clocks that perfectly correspond to one another for a period of time.

5. References and Further Reading

a. Primary Sources in English Translation

Ariew, Roger and Daniel Garber, eds. Philosophical Essays. Indianapolis: Hackett, 1989.
- Contains translations of many of Leibniz’s most important shorter writings such as the Monadology, the Principles of Nature and Grace, the Discourse on Metaphysics, and excerpts from Leibniz’s correspondence, to name just a few.
Ariew, Roger, ed. Correspondence [between Leibniz and Clarke]. Indianapolis: Hackett, 2000.
- A translation of Leibniz’s correspondence with Samuel Clarke, which touches on many important topics in metaphysics and philosophy of mind.
Francks, Richard and Roger S. Woolhouse, eds. Leibniz’s ‘New System’ and Associated Contemporary Texts. Oxford: Oxford University Press, 1997.
- Contains English translations of additional short texts.
Francks, Richard and Roger S. Woolhouse, eds. Philosophical Texts. Oxford: Oxford University Press, 1998.
- Contains English translations of additional short texts.
Huggard, E. M., ed. Theodicy: Essays on the Goodness of God, the Freedom of Man and the Origin of Evil. La Salle: Open Court, 1985.
- Translation of the only philosophical monograph Leibniz published in his lifetime, which contains many important discussions of free will.
Lodge, Paul, ed. The Leibniz–De Volder Correspondence: With Selections from the Correspondence between Leibniz and Johann Bernoulli. New Haven: Yale University Press, 2013.
- An edition, with English translations, of Leibniz’s correspondence with De Volder, which is a very important source of information about Leibniz’s mature metaphysics.
Loemker, Leroy E., ed. Philosophical Papers and Letters. Dordrecht: D. Reidel, 1970.
- Contains English translations of additional short texts.
Look, Brandon and Donald Rutherford, eds. The Leibniz–Des Bosses Correspondence. New Haven: Yale University Press, 2007.
- An edition, with English translations, of Leibniz’s correspondence with Des Bosses, which is another important source of information about Leibniz’s mature metaphysics.
Parkinson, George Henry Radcliffe and Mary Morris, eds. Philosophical Writings. London: Everyman, 1973.
- Contains English translations of additional short texts.
Remnant, Peter and Jonathan Francis Bennett, eds. New Essays on Human Understanding. Cambridge: Cambridge University Press, 1996.
- Translation of Leibniz’s section-by-section response to Locke’s Essay Concerning Human Understanding, written in the form of a dialogue between the two fictional characters Philalethes and Theophilus, who represent Locke’s and Leibniz’s views, respectively.
Rescher, Nicholas, ed. G.W. Leibniz’s Monadology: An Edition for Students. Pittsburgh: University of Pittsburgh Press, 1991.
- An edition, with English translation, of the Monadology, with commentary and a useful collection of parallel passages from other Leibniz texts.
Strickland, Lloyd H., ed. The Shorter Leibniz Texts: A Collection of New Translations. London: Continuum, 2006.
- Contains English translations of additional short texts.

b. Secondary Sources

Adams, Robert Merrihew. Leibniz: Determinist, Theist, Idealist. New York: Oxford University Press, 1994.
- One of the most influential and rigorous works on Leibniz’s metaphysics.
Borst, Clive. “Leibniz and the Compatibilist Account of Free Will.” Studia Leibnitiana 24.1 (1992): 49-58.
- About Leibniz’s views on free will.
Brandom, Robert. “Leibniz and Degrees of Perception.” Journal of the History of Philosophy 19 (1981): 447-79.
- About Leibniz’s views on perception and perceptual distinctness.
Davidson, Jack. “Imitators of God: Leibniz on Human Freedom.” Journal of the History of Philosophy 36.3 (1998): 387-412.
- Another helpful article about Leibniz’s views on free will and on the ways in which human freedom resembles divine freedom.
Davidson, Jack. “Leibniz on Free Will.” The Continuum Companion to Leibniz. Ed. Brandon Look. London: Continuum, 2011. 208-222.
- Accessible general introduction to Leibniz’s views on freedom of the will.
Duncan, Stewart. “Leibniz’s Mill Argument Against Materialism.” Philosophical Quarterly 62.247 (2011): 250-72.
- Helpful discussion of Leibniz’s mill argument.
Garber, Daniel. Leibniz: Body, Substance, Monad. New York: Oxford University Press, 2009.
- A thorough study of the development of Leibniz’s metaphysical views.
Gennaro, Rocco J. “Leibniz on Consciousness and Self-Consciousness.” New Essays on the Rationalists. Eds. Rocco J. Gennaro and C. Huenemann. Oxford: Oxford University Press, 1999. 353-371.
- Discusses Leibniz’s views on consciousness and highlights the advantages of reading Leibniz as endorsing a higher-order thought theory of consciousness.
Jolley, Nicholas. Leibniz. London; New York: Routledge, 2005.
- Good general introduction to Leibniz’s philosophy; includes chapters on the mind and freedom.
Jorgensen, Larry M. “Leibniz on Memory and Consciousness.” British Journal for the History of Philosophy 19.5 (2011a): 887-916.
- Elaborates on Jorgensen (2009) and discusses the role of memory in Leibniz’s theory of consciousness.
Jorgensen, Larry M. “Mind the Gap: Reflection and Consciousness in Leibniz.” Studia Leibnitiana 43.2 (2011b): 179-95.
- About Leibniz’s account of reflection and reasoning.
Jorgensen, Larry M. “The Principle of Continuity and Leibniz’s Theory of Consciousness.” Journal of the History of Philosophy 47.2 (2009): 223-48.
- Argues against ascribing a higher-order theory of consciousness to Leibniz.
Kulstad, Mark. Leibniz on Apperception, Consciousness, and Reflection. Munich: Philosophia, 1991.
- Influential, meticulous study of Leibniz’s views on consciousness.
Kulstad, Mark. “Leibniz, Animals, and Apperception.” Studia Leibnitiana 13 (1981): 25-60.
- A shorter discussion of some of the issues in Kulstad (1991).
Lodge, Paul, and Marc E. Bobro. “Stepping Back Inside Leibniz’s Mill.” The Monist 81.4 (1998): 553-72.
- Discusses Leibniz’s mill argument.
Lodge, Paul. “Leibniz’s Mill Argument Against Mechanical Materialism Revisited.” Ergo (2014).
- Further discussion of Leibniz’s mill argument.
McRae, Robert. Leibniz: Perception, Apperception, and Thought. Toronto: University of Toronto Press, 1976.
- An important and still helpful, even if somewhat dated, study of Leibniz’s philosophy of mind.
Murray, Michael J. “Spontaneity and Freedom in Leibniz.” Leibniz: Nature and Freedom. Eds. Donald Rutherford and Jan A. Cover. Oxford: Oxford University Press, 2005. 194-216.
- Discusses Leibniz’s views on free will and self-determination, or spontaneity.
Phemister, Pauline. “Leibniz, Freedom of Will and Rationality.” Studia Leibnitiana 26.1 (1991): 25-39.
- Explores the connections between rationality and freedom in Leibniz.
Rozemond, Marleen. “Leibniz on the Union of Body and Soul.” Archiv für Geschichte der Philosophie 79.2 (1997): 150-78.
- About the mind-body problem and pre-established harmony in Leibniz.
Rozemond, Marleen. “Mills Can’t Think: Leibniz’s Approach to the Mind-Body Problem.” Res Philosophica 91.1 (2014): 1-28.
- Another helpful discussion of the mill argument.
Savile, Anthony. Routledge Philosophy Guidebook to Leibniz and the Monadology. New York: Routledge, 2000.
- Very accessible introduction to Leibniz’s Monadology.
Simmons, Alison. “Changing the Cartesian Mind: Leibniz on Sensation, Representation and Consciousness.” The Philosophical Review 110.1 (2001): 31-75.
- Insightful discussion of the ways in which Leibniz’s philosophy of mind differs from the Cartesian view; also argues that Leibnizian consciousness consists in higher-order perceptions.
Sotnak, Eric. “The Range of Leibnizian Compatibilism.” New Essays on the Rationalists. Eds. Rocco J. Gennaro and C. Huenemann. Oxford: Oxford University Press, 1999. 200-223.
- About Leibniz’s theory of freedom.
Swoyer, Chris. “Leibnizian Expression.” Journal of the History of Philosophy 33 (1995): 65-99.
- About Leibnizian perception.
Wilson, Margaret Dauler. “Confused Vs. Distinct Perception in Leibniz: Consciousness, Representation, and God’s Mind.” Ideas and Mechanism: Essays on Early Modern Philosophy. Princeton: Princeton University Press, 1999. 336-352.
- About Leibnizian perception as well as perceptual distinctness.

Author Information

Julia Jorati
Email: jorati.1@osu.edu
The Ohio State University
U. S. A.

Kwasi Wiredu (1931—2022)

Kwasi Wiredu was a philosopher from Ghana, who had for decades been involved with a project he termed “conceptual decolonization” in contemporary African systems of thought. By conceptual decolonization, Wiredu advocated a re-examination of current African epistemic formations in order to accomplish two aims. First, he wished to subvert unsavory aspects of tribal culture embedded in modern African thought so as to make that thought more viable. Second, he intended to dislodge unnecessary Western epistemologies that are to be found in African philosophical practices.

In previously colonized regions of the world, decolonization has remained a topical issue both at the highest theoretical levels and also at the basic level of everyday existence. After African countries attained political liberation, decolonization became an immediate and overwhelming preoccupation. A broad spectrum of academic disciplines took up the conceptual challenges of decolonization in a variety of ways. The disciplines of anthropology, history, political science, literature, and philosophy all grappled with the practical and academic conundrums of decolonization.

A central purpose in this article is to examine the contributions and limitations of African philosophy in relation to the history of the debate on decolonization. In this light, it sometimes appears that African philosophy has been quite limited in defining the horizons of the debate when compared with the achievements of academic specialties such as literature and cultural studies. Thus, decolonization has been rightly conceived as a vast, global, and trans-disciplinary enterprise.

This analysis involves an examination of both the limitations and immense possibilities of Wiredu’s theory of conceptual decolonization. First, the article offers a close reading of the theory itself and then locates it within the broader movement of modern African thought. In several instances, Wiredu’s theory has proved seminal to the advancement of contemporary African philosophical practices. It is also necessary to be aware of current imperatives of globalization, nationality, and territoriality and how they affect the agency of a theory such as ideological/conceptual decolonization. Indeed, the notion of decolonization is far more complex than is often assumed. Consequently, the epistemological resources by which it can be apprehended as a concept, ideology, or process are multiple and diverse. Lastly, this article, as a whole, represents a reflection on the diversity of the dimensions of decolonization.

Introduction
Early Beginnings
Decolonization as Epistemological Practice
Tradition, Modernity and the Challenges of Development
An African Reading of Karl Marx
Conclusion
References and Further Reading

1. Introduction

As one of Africa’s foremost philosophers, Kwasi Wiredu has done a great deal to establish the discipline of philosophy, in its contemporary shape, as a credible area of intellection in most parts of the African continent and beyond. In order to appreciate the conceptual and historical contexts of his work, it is necessary to possess some familiarity with relevant discourses in African studies and history, anthropology, literature and postcolonial theory, particularly those advanced by Edward W. Said, Gayatri Spivak, Homi Bhabha, Abiola Irele and Biodun Jeyifo. Wiredu’s contribution to the making of modern African thought provides an interesting insight into the processes involved in the formation of postcolonial disciplines and discourses, and it can also be conceived as a counter-articulation to the hegemonic discourses of imperial domination.

Wiredu, for many decades, was involved with a project he termed conceptual decolonization in contemporary African systems of thought. This term entailed, for Wiredu, a re-examination of current African epistemic foundations in order to accomplish two main objectives. First, he intended to undermine counter-productive facets of tribal cultures embedded in modern African thought, so as to make this body of thought both more sustainable and more rational. Second, he intended to deconstruct the unnecessary Western epistemologies which may be found in African philosophical practices.

A broad spectrum of academic disciplines took up the conceptual challenges of decolonization in a variety of ways. In particular, the disciplines of anthropology, history, political science, literature and philosophy all grappled with the practical and academic challenges inherent to decolonization.

It is usually profitable to examine the contributions and limitations of African philosophers comparatively (along with other African thinkers who are not professional philosophers) in relation to the history of the debate on decolonization. In addition to the scholars noted above, the discourse of decolonization has benefitted from many valuable contributions made by intellectuals such as Frantz Fanon, Leopold Sedar Senghor, Cheikh Anta Diop, and Ngugi wa Thiongo. In this light, it would appear that African philosophy has been, at certain moments, limited in defining the horizons of the debate when compared with the achievements of academic specialties such as literature, postcolonial theory and cultural studies. Thus, decolonization, as Ngugi wa Thiongo, the Kenyan cultural theorist and novelist, notes, must be conceived as a broad, transcontinental, and multidisciplinary venture.

Within the Anglophone contingent of African philosophy, the analytic tradition of British philosophy continues to be dominant. This discursive hegemony has led to an evident degree of parochialism, according to Wiredu. This in turn has led to the neglect of many other important intellectual traditions. For instance, within this Anglophonic sphere, there is not always a systematic interrogation of the limits, excesses and uses of colonialist anthropology in formulating the problematic of identity. In this regard, the problematic of identity does not only refer to the question of personal agency but, more broadly, to the challenges of discursive identity. This shortcoming is not as evident in Francophone traditions of African philosophy, which usually highlight the foundational discursive interactions between anthropology and modern African thought. Thus, in this instance, there is an opening to other discursive formations necessary for nurturing a vibrant philosophical practice. Also, within Anglophone African philosophy, a stringent critique of imperialism and contemporary globalization does not always figure significantly in the substance of the discourse, thereby further underlining the drawbacks of parochialism. As such, it is necessary for critiques of Wiredu’s corpus to move beyond its ostensible frame to include critiques and discussions of traditions of philosophical practice outside the Anglophone divide of modern African thought (Osha, 2005). Accordingly, such critiques ought not merely to be a celebration of post-structuralist discourses to the detriment of African intellectual traditions. Instead, they should be, among other things, an exploration of the discursive intimacies between the Anglophone and Francophone traditions of African philosophy.

In addition, an interrogation of other borders of philosophy is required to observe the gains that might accrue to the Anglophone movement of contemporary African philosophy, which, in many ways, has reached a discursive dead-end due to its inability to surmount the intractable problematic of identity, and its endless preoccupation with the question of its origins. These are the sorts of interrogations that readings of Wiredu’s work necessitate. Furthermore, a study of Wiredu’s corpus (Osha, 2005) identifies, if only obliquely, the necessity to re-assess the importance of other discourses such as colonialist anthropology and various philosophies of black subjectivity in the formation of the modern African subject. These are some of the central concerns which appear in Kwasi Wiredu and Beyond: The Text, Writing and Thought in Africa (2005).

2. Early Beginnings

Kwasi Wiredu was born in 1931 in Ghana and had his first exposure to philosophy quite early in life. He read his first couple of books of philosophy in school around 1947 in Kumasi, the capital of Ashanti. These books were Bernard Bosanquet’s The Essentials of Logic and C.E.M. Joad’s Teach Yourself Philosophy. Logic, as a branch of philosophy, attracted Wiredu because of its affinities to grammar, which he enjoyed. He was also fond of practical psychology during the formative years of his life. In 1950, whilst vacationing with his aunt in Accra, the capital of Ghana, he came across another philosophical text which influenced him tremendously. The text was The Last Days of Socrates, which had the following four dialogues by Plato: The Apology, Euthyphro, Meno and Crito. These dialogues were to influence, in a significant way, the final chapter of his first groundbreaking philosophical text, Philosophy and an African Culture (1980), which is also dialogic in structure.

He was admitted into the University of Ghana, Legon in 1952, to read philosophy, but before attending he started to study the thought of John Dewey on his own. However, mention must be made of the fact that C. E. M. Joad’s philosophy had a particularly powerful effect on him. Indeed, he employed the name J. E. Joad as his pen-name for a series of political articles he wrote for a national newspaper, The Ashanti Sentinel, between 1950 and 1951. At the University of Ghana, he was instructed mainly in Western philosophy and he came to find out about African traditions of thought more or less through his own individual efforts. He was later to admit that the character of his undergraduate education was to leave his mind a virtual tabula rasa, as far as African philosophy was concerned. In other words, he had to develop and maintain his interests in African philosophy on his own. One of the first texts of African philosophy that he read was J. B. Danquah’s Akan Doctrine of God: A Fragment of Gold Coast Ethics and Religion. Undoubtedly, his best friend William Abraham, who went a year before him to Oxford University, must have also influenced the direction of his philosophical research towards African thought. A passage from an interview explains the issue of his institutional relation to African philosophy:

Prior to 1985, when I was in Africa, I devoted most of my time in almost equal proportions to research in African philosophy and in other areas of philosophy, such as the philosophy of logic, in which not much has, or is generally known to have, been done in African philosophy. I did not have always to be teaching African philosophy or giving public lectures in African philosophy. There were others who were also competent to teach the subject and give talks in our Department of Philosophy. But since I came to the United States, I have often been called upon to teach or talk about African philosophy. I have therefore spent much more time than before researching in that area. This does not mean that I have altogether ignored my earlier interests, for indeed, I continue to teach subjects like (Western) logic and epistemology (Wiredu in Oladipo, 2002: 332).

Wiredu began publishing relatively late, but has been exceedingly prolific ever since he started. During the early to mid 1970s, he often published as many as six major papers per year on topics ranging from logic, to epistemology, to African systems of thought, in reputable international journals. His first major book, Philosophy and an African Culture (1980), is truly remarkable for its eclectic range of interests. Paulin Hountondji, Wiredu’s great contemporary from the Republic of Benin, for many years had to deal with charges from critics such as Olabiyi Yai (1977) that his philosophically impressive corpus lacked ideological content and therefore merit. Hountondji (1983; 2002), in those times of extreme ideologizing, never avoided the required measure of socialist posturing. Wiredu, on the other hand, not only avoided the lure of socialism but went on to denounce it as an unfit ideology. Within the context of the socio-political moment of that era, it seemed a reactionary, even injurious, posture to adopt. Nonetheless, he had not only laid the foundations of his project of conceptual decolonization at the theoretical level but had also begun to explore its various practical implications by his analyses of concepts such as “truth,” and also by his focused critique of some of the more counter-productive impacts of both colonialism and traditional culture.

By conceptual decolonization, Wiredu advocates a re-examination of current African epistemic formations in order to accomplish two objectives. First, he wishes to subvert unsavoury aspects of indigenous traditions embedded in modern African thought so as to make it more viable. Second, he intends to undermine the unhelpful Western epistemologies to be found in African philosophical traditions. On this important formulation of his he states:

By this I mean the purging of African philosophical thinking of all uncritical assimilation of Western ways of thinking. That, of course, would be only part of the battle won. The other desiderata are the careful study of our own traditional philosophies and the synthesising of any insights obtained from that source with any other insights that might be gained from the intellectual resources of the modern world. In my opinion, it is only by such a reflective integration of the traditional and the modern that contemporary African philosophers can contribute to the flourishing of our peoples and, ultimately, all other peoples. (Oladipo, 2002: 328)

In spite of his invaluable contributions to modern African thought, it can be argued that Wiredu’s schema falls short as a feasible long term epistemic project. Due to the hybridity of the postcolonial condition, projects seeking to retrieve the precolonial heritage are bound to be marred at several levels. It would be an error for Wiredu or advocates of his project of conceptual decolonization to attempt to universalize his theory since, as Ngugi wa Thiongo argues, decolonization is a vast, global enterprise. Rather, it is safer to read Wiredu’s project as a way of articulating theoretical presence for the de-agentialized and deterritorialized contemporary African subject. In many ways, his project resembles those of Ngugi wa Thiongo and Cheikh Anta Diop. Ngugi wa Thiongo advocates cultural and linguistic decolonization on a global scale and his theory has undergone very little transformation since its formulation in the 1960s. Diop advances a similar set of ideas to Wiredu on the subject of vibrant modern African identities. Wiredu’s project is linked in conceptual terms to the broader project of political decolonization as advanced by liberationist African leaders such as Julius Nyerere, Jomo Kenyatta, Kwame Nkrumah, and Nnamdi Azikiwe. But what distinguishes the particular complexion of his theory is its links with the Anglo-Saxon analytic tradition. This dimension is important in differentiating his project from those of his equally illustrious contemporaries such as V. Y. Mudimbe and Paulin Hountondji. In fact, it can be argued that Wiredu’s theory of conceptual decolonization has more similarities with Ngugi wa Thiongo’s ideas regarding African cultural and linguistic agency than Mudimbe’s archeological excavations of African traces in Western historical and anthropological texts.

3. Decolonization as Epistemological Practice

In all previously colonized regions of the world, decolonization remains a topic of considerable academic interest. Wiredu’s theory of conceptual decolonization is essentially what defines his attitudes and gestures towards the content of contemporary African thought. It is also an insight that is inflected by years of immersion in British analytic philosophy. Wiredu began his reflections on the nature, legitimate aims, and possible orientations of contemporary African thought not as a result of any particular awareness of the trauma or violence of colonialism or imperialism but through a confrontation with the dilemma of modernity by the reflective (post)colonial African consciousness. This dialectic origin can be contrasted with those of his contemporaries such as Paulin Hountondji and V. Y. Mudimbe.

Despite criticisms regarding some aspects of his work, in terms of founding a tradition for the practice of modern African philosophy, Wiredu’s contributions have been pivotal. He has also been very consistent in his output and the quality of his reflections regardless of some of their more obvious limitations.

As noted earlier, Wiredu was trained in a particular tradition of Western philosophy: the analytic tradition. This fact is reflected in his corpus. A major charge held against him is that his contributions could be made even richer if he had grappled with other relevant discourses: postcolonial theory, African feminisms, contemporary Afrocentric discourses and the global dimensions of projects and discourses of decolonization.

Kwasi Wiredu’s interests and philosophical importance are certainly not limited to conceptual decolonization alone. He has offered some useful insights on Marxism, mysticism, metaphysics, and the general nature of the philosophical enterprise itself. Although his later text, Cultural Universals and Particulars, has a more Africa-centred orientation, his first book, Philosophy and an African Culture, presents a wider range of discursive interests: a vigorous critique of Marxism, reflections on the phenomenon of ideology, analyses of truth and the philosophy of language, among other preoccupations. It is interesting to see how Wiredu weaves together these different preoccupations and also to observe how some of them have endured while others have not.

The volume Conceptual Decolonisation in African Philosophy is an apt summation of Wiredu’s philosophical interests with a decidedly African problematic while his landmark philosophical work, Philosophy and an African Culture, published first in 1980, should serve as a fertile source for more detailed elucidation.

In the second essay of Conceptual Decolonisation in African Philosophy, entitled “The Need for Conceptual Decolonisation in African Philosophy”, Wiredu writes that “with an even greater sense of urgency the intervening decade does not seem to have brought any indications of a widespread realization of the need for conceptual decolonisation in African philosophy” (Wiredu, 1995: 23). The intention at this juncture is to examine some of the ways in which Wiredu has been involved in the daunting task of conceptual decolonization. Decolonization itself is a problematic exercise because, first, it necessitates the jettisoning of certain conceptual attitudes that inform one’s worldviews. Secondly, it usually entails an attempt at the retrieval of a more or less fragmented historical heritage. Decolonization in Fanon’s conception entails this necessity for all colonized peoples and, in addition, it is “a programme of complete disorder” (Fanon, 1963:20). This understanding is purely political and has, therefore, a practical import. This is not to say that Fanon had no plan for the project of decolonization in the intellectual sphere. Also associated with this project as it was then conceived was a struggle for the mental liberation of the colonized African peoples. It was indeed a program of violence in more senses than one.

However, with Wiredu, there isn’t an outright endorsement of violence, as decolonization in this instance amounts to conceptual subversion. As a logical consequence, it is necessary to stress the difference between Fanon’s conception of decolonization and Wiredu’s. Fanon is sometimes regarded as belonging to the same philosophical persuasion that harbours figures like Nkrumah, Senghor, Nyerere and Sekou Toure, “the philosopher-kings of early post-independence Africa” (Wiredu,1995:14), as Wiredu calls them. This is so because they had to live out the various dramas of existence and the struggles for self and collective identity at more or less the same colonial/postcolonial moment. Those “spiritual uncles” of professional African philosophers were engaged, as Wiredu states, in a strictly political struggle, and whatever philosophical insight they possessed was put at the disposal of this struggle, instead of a merely theoretical endeavour. Obviously, Fanon was the most astute theoretician of decolonization of the lot. In addition, for Fanon and the so-called philosopher-kings, decolonization was invested with a pan-African mandate and political appeal. This crucial difference should be noted alongside what shall soon be demonstrated to be the Wiredu conception of decolonization. Africans, generally, will have to continue to ponder the entire issue of decolonization as long as unsolved questions of identity remain and the challenges of collective development linger. This type of challenge was foreseen by Fanon.

The end of colonialism in Africa and other Third World countries did not entail the end of imperialism and the dominance of the metropolitan countries. Instead, the dynamics of dominance assumed a more complex, if subtle, form. African economic systems floundered alongside African political institutions, and, as a result, various crises have compounded the seemingly perennial issue of underdevelopment.

A significant portion of post-colonial theory involves the entry of Third World scholars into the Western archive, as it were, with the intention of dislodging the erroneous epistemological assumptions and structures regarding their peoples. This, arguably, is another variant of decolonization. Wiredu partakes of this type of activity, but sometimes he carries the program even further. Accordingly, he affirms:

Until Africa can have a lingua franca, we will have to communicate suitable parts of our work in our multifarious vernaculars, and in other forms of popular discourse, while using the metropolitan languages for international communication. (Wiredu, 1995:20)

This conviction has been a guiding principle with Wiredu for several years. In fact, it is not merely a conviction; there are several instances within the broad spectrum of his philosophical corpus where he tries to put it into practice. Two such attempts are his essays “The Concept of Truth in the Akan Language” and “The Akan Concept of Mind.” In the first of these articles, Wiredu states that “there is no one word in Akan for truth” (Wiredu, 1985:46). Similarly, he writes, “another linguistic contrast between Akan and English is that there is no word ‘fact’” (Ibid.). It is necessary to cite the central thesis of the essay; Wiredu writes that he wants “to make a metadoctrinal point which reflection on the African language enables us to see, which is that a theory of truth is not of any real universal significance unless it offers some account of the notion of being so” (Ibid.).

Wiredu’s argument here needs to be firmer. In many respects, he is only comparing component parts of the English language with the Akan language and not always with a view to drawing out “any real universal significance” as he says. The entire approach seems to be irrevocably restrictive. This is the distinction that lies between an oral culture and a textual one. Most African intellectuals usually gloss over this difference, even though they may acknowledge it. The difference is indeed very significant, because of the numerous imponderables that come into play. Abiola Irele has been able to demonstrate the tremendous significance of orality in the constitution of modern African forms of literary expression.

However, Wiredu is more convincing in his essay “Democracy and Consensus in African Traditional Politics: A Plea for a Non-Party Polity”. In this essay, Wiredu argues that the:

Ashanti system was a consensual democracy. It was a democracy because government was by the consent, and subject to the control, of the people as expressed through the representatives. It was consensual because, at least, as a rule, that consent was negotiated on the principle of consensus. (By contrast, the majoritarian system might be said to be, in principle, based on consent without consensus.) (Ibid. pp. 58-59)

When Wiredu broaches the issue of politics and its present and future contexts in postcolonial Africa, we are compelled to visit a whole range of debates and discourses, especially in the social sciences in Africa. These are arguably more directly concerned with questions pertaining to governance, democracy, and the challenges of contemporary globalization.

Another essay by Wiredu, entitled “The Akan Concept of Mind”, is also an attempt at conceptual recontextualization. Wiredu begins by stating that he is restricting himself to a study of the Akans of Ghana in order “to keep the discussion within reasonable anthropological bounds” (Wiredu, 1983:113). His objective is a modest but nevertheless important one, since it fits quite well with his entire philosophical project which, as noted, is concerned with ironing out philosophical issues “on independent grounds” and possibly in one’s own language and the metropolitan language bequeathed by the colonial heritage.

It is therefore appropriate to proceed gradually, traversing the problematic interfaces between various languages in search of satisfactory structures of meaning. The immediate effect is a radical diminishing of the entire concept of African philosophy, a term which under these circumstances would become even more problematic. The consequence of Wiredu’s position is that to arrive at the essence of African philosophy, it would be necessary to dismantle its monolithic structure to make it more context-bound. First, Africa as a spatial entity would require further re-drawing of its often problematic geography. Second, a new thematics to mediate between the general and the particular would have to be found. Third, the critique of unanimism and ethnophilosophy would be driven into more contested terrains. These are some of the likely challenges posed by Wiredu’s approach.

Furthermore, in dealing with the traditional Akan conceptual system, or any other, for that matter, it should be borne in mind that what is in contention is “a folk philosophy, a body of originally unwritten ideas preserved in the oral traditions, customs and usages of a people” (Ibid.).

It would be appropriate to examine more closely his article “The Akan Concept of Mind”. Here, Wiredu enumerates the ways in which the English conception of mind differs markedly from that of the Akan, due in large part to certain fundamental linguistic dissimilarities. He also makes the point that “the Akans most certainly do not regard mind as one of the entities that go to constitute a person” (Ibid. 121). It is significant to note this, but at the same time, it is difficult to imagine the ultimate viability of this approach. Indeed, after reformulating traditional Western philosophical problems to suit African conditions, it remains to be seen how African epistemological claims can be substantiated using the natural and logical procedures available to African systems of thought. As such, it is possible to argue that this conceptual manoeuvre would eventually degenerate into a dead-end of epistemic nativism. These are the kinds of issues raised by Wiredu’s project.

As such, inherent in the thrust for complete decolonization is the presence of colonial violence itself. In addition, there is essentially a latent desire for epistemic violence, as well as difficulties concerning the negotiation of linguistic divides. In the following quotation, for example, Wiredu attempts to demonstrate the significance of some of those differences:

By comparison with the conflation of concepts of mind and soul prevalent in Western philosophy, the Akan separation of the “Okra” from “adwene” suggests a more analytical awareness of the sanctification of human personality. (Ibid. 128)

It is necessary to substantiate more rigorously claims such as this because we may also be committing an error in establishing certain troublesome linguistic or philosophical correspondences between two disparate cultures and traditions.

Another crucial, if distressing, feature of decolonization as advanced by Wiredu is that it always has to measure itself up with the colonizing Other, that is, it finds it almost impossible to create its own image so to speak by the employment of autochthonous strategies. This is not to assert that decolonization always has to avail itself of indigenous procedures, but the very concept of decolonization is in fact concerned with breaking away from imperial structures of dominance in order to express a will to self-identity or presence. To be sure, the Other is always present, defacing all claims to full presence of the decolonizing subject. This is a contradictory but inevitable trope within the postcolonial condition. The Other is always there to present the criteria by which self-identity is adjudged either favourably or unfavourably. There is no getting around the Other as it is introduced in its own latent and covert violence, in the hesitant counter-violence of the decolonizing subject and invariably in the counter-articulations of all projects of decolonization.

4. Tradition, Modernity and the Challenges of Development

Wiredu’s later attempts at conceptual decolonization have been quite interesting. An example of such an attempt is the essay “Custom and Morality: A Comparative Analysis of some African and Western Conceptions of Morals.” He is able to explore at greater length some of the conceptual confusions that arise as a result of the transplantation of Western ideas within an African frame of reference. This wholesale transference of foreign ideas and conceptual models has caused the occurrence of severe cases of identity crises and, to borrow a more apposite term, colonial mentality. Indeed, one of the aims of Wiredu’s efforts at conceptual decolonization is to indicate instances of colonial mentality and determine strategies by which they can be minimized. Accordingly he is quite convincing when he argues that polygamy in a traditional setting amounts to efficient social thinking but is most inappropriate within a modern framework. In this way, Wiredu is offering a critique of a certain traditional practice that ought to be discarded on account of the demands and realities of a modern economy.

On another level, it appears that Wiredu has not sufficiently interrogated the distance between orality and textuality. If indeed he had done so, he would be rather more skeptical about the manner in which he thinks he can dislodge certain Western philosophical structures embedded in the African consciousness.

Wiredu has always believed that traditional modes of thought and folk philosophies should be interpreted, clarified, analyzed and subjected to critical evaluation and assimilation (Wiredu, 1980: x). Also, at the beginning of his philosophical reflections, he puts forth the crucial formulation that there is no reason why the African philosopher “in his philosophical meditations […] should not test formulations in those against intuitions in his own language” (Wiredu, 1980: xi). And, rather than merely discussing the possibilities for evolving modern traditions in African philosophy, African philosophers should actually begin to do so (Hountondji, 1983). In carrying out this task, the African philosopher has a few available methodological approaches. First, he is urged to “acquaint himself with the different philosophies of the different cultures of the world, not to be encyclopaedic or eclectic, but with the aim of trying to see how far issues and concepts of universal relevance can be disentangled from the contingencies of culture” (Wiredu, 1980: 31). He also adds that “the African philosopher has no choice but to conduct his philosophical inquiries in relation to the philosophical writings of other peoples, for his ancestors left him no heritage of philosophical writings” (Wiredu, 1980: 48). For Wiredu, the use of translations is a fundamental aspect of contemporary African philosophical practices. However, on the dilemmas of translation in the current age of neoliberalism, it has been noted: “translations are […] put ‘out of joint.’ However correct or legitimate they may be, and whatever right one may acknowledge them to have, they are all disadjusted, as it were unjust in the gap that affects them. This gap is within them, to be sure, because their meanings remain necessarily equivocal; next it is in the relation among them and thus their multiplicity, and finally or first of all in the irreducible inadequation to the other language and to the stroke of genius of the event that makes the law, to all the virtualities of the original” (Derrida, 1994:19). Wiredu does not contemplate the implications of this kind of indictment in his formulations of an approach to African philosophy. Perhaps the task at hand is simply too important and demanding to cater to such philosophical niceties. In relation to the kind of philosophical heritage at the disposal of the African philosopher, Wiredu identifies three main strands: “a folk philosophy, a written traditional philosophy and a modern philosophy” (Wiredu, 1980:46). Wiredu’s approach to questions of this sort is embedded in his general theoretical stance: “It is a function, indeed a duty, of philosophy in any society to examine the intellectual foundations of its culture. For any such examination to be of any real use, it should take the form of reasoned criticism and, where possible, reconstruction. No other way to philosophical progress is known than through criticism and adaptation” (Wiredu, 1980: 20).

The drive to attain progress is not limited to philosophical discourse alone. Entire communities and cultures usually aim to improve upon their institutions and practices in order to remain relevant. Societies can lose the momentum of growth and “various habits of thought and practice can become anachronistic within the context of the development of a given society; but an entire society too can become anachronistic within the context of the whole world if the ways of life within it are predominantly anachronistic. In the latter case, of course, there is no discarding society; what you do is to modernize it” (Wiredu, 1980:1). The theme of modernization occurs frequently in Wiredu’s corpus. He does not fully conceptualize it nor relate it to the various ideological histories it has encountered in the domains of social science, where it became a fully fledged discipline. Modernization, for him, is based on an uncomplicated pragmatism that owes much to Deweyan thought.

This kind of posture, that is, the consistent critique of the retrogression inherent in tradition and its proclivity for the fossilization of culture, is directed at Leopold Sedar Senghor. On Senghor, he writes, “it is almost as if he has been trying to exemplify in his own thought and discourse the lack of the analytical habit which he has attributed to the biology of the African. Most seriously of all, Senghor has celebrated the fact our (traditional) mind is of a non-analytical bent; which is very unfortunate, seeing that this mental attribute is more of a limitation than anything else” (Wiredu, 1980:12). Wiredu’s main criticism of Senghor is one that is always leveled against the latter. Apart from the charge that Senghor essentializes the concept and ideologies of blackness, he is also charged with a defeatism that undermines struggles for liberation and decolonization. However, Paul Gilroy has unearthed a more sympathetic context in which to read and situate Senghorian thought. In Gilroy’s reading, an acceptable ideology of blackness emerges from Senghor’s work. And in this way, Wiredu’s critique loses some of its originality.

Senghor is cast as a traditionalist and tradition itself is the subject of a much broader critique. On some of the drawbacks of tradition Wiredu writes,

it is as true in Africa as anywhere else that logical, mathematical, analytical, experimental procedures are essential in the quest for the knowledge of, and control over, nature and therefore, in any endeavour to improve the condition of man. Our traditional culture was somewhat wanting in this respect and this is largely responsible for the weaknesses of traditional technology, warfare, architecture, medicine…. (Wiredu, 1980: 12) (italics mine)

Sometimes, Wiredu carries his critique of tradition too far, as when he advances the view that “traditional medicine is terribly weak in diagnosis and weaker still in pharmacology” (Wiredu, 1980: 12). In recent times, a major part of Hountondji’s project has been to demonstrate not only that traditional knowledges are useful and viable but also that they need to be situated in appropriate modern contexts. Hountondji’s latest gesture is curious since both he and Wiredu are supposed to belong to the same philosophic tendency as described by Bodunrin under the rubric of West-led universalism. However, Wiredu’s attack on tradition is vitiated by his project of conceptual decolonization which, in order to work, requires the recuperation of vital elements in traditional culture.

Wiredu’s stance in relation to modernization and tradition gets refined by his condemnation of some aspects of urban existence which exhibit a manifestation of postmodern environmentalism. First, he writes, “it is quite clear to me that unrestricted industrial urbanization is contrary to any humane culture; it is certainly contrary to our own” (Wiredu, 1980:22). Also, “one of the powerful strains on our extended family system is the very extensive poverty which oppresses our rural populations. Owing to this, people working in the towns and cities are constantly burdened with the financial needs of rural relatives which they usually cannot entirely satisfy” (Wiredu, 1980:22). Contemporary anthropological studies dealing with Africa have dwelt extensively on this phenomenon. The point is that, in Africa, forms of sociality exist that can no longer be found in the North Atlantic civilization. If this civilization (the North Atlantic) is characterized by extreme individualism, African forms of social existence, on the other hand, tend towards the gregarious, in which conceptions of generosity, corruption, gratitude, philanthropy, ethnicity and even justice take on slightly different forms from what obtains within the vastly different North Atlantic context.

Also problematic is Wiredu’s reading of colonialism which is very similar to those of authors such as Ngugi wa Thiongo, Walter Rodney or even Chinua Achebe. In this reading, the colonized is abused, brutalized, silenced and reconstructed against her/his own will. Colonialism causes the destruction of agency. On de-agentialization, Wiredu states, “any human arrangement is authoritarian if it entails any person being made to do or suffer something against his will, or if it leads to any person being hindered in the development of his own will” (Wiredu, 1980:2). Homi Bhabha advances the notion of ambivalence to highlight the cultural reciprocities inherent in the entire colonial encounter and structure. This kind of reading of the colonial event has led to a rethinking of colonial theory. But Wiredu’s reading of the colonial encounter is infected by the radical persuasion of early African theorists of decolonization: “The period of colonial struggle was […] a period of cultural affirmation. It was necessary to restore in ourselves our previous confidence which had been so seriously eroded by colonialism. We are still, admittedly, even in post-colonial times, in an era of cultural self-affirmation” (Ibid.59).

5. An African Reading of Karl Marx

Marxist theory and discourse generally provided many African intellectuals with a platform on which to conduct many sociopolitical struggles. In fact, for many African scholars, it served as the only ideological tool. But not all scholars found Marxism acceptable. Wiredu was one of the scholars who had deep reservations about it. But he is not in doubt about the philosophical significance of Marx: “I regard Karl Marx as one of the great philosophers” (Wiredu, 1980:63). Derrida is even more forthcoming on the depth of this significance: “It will always be a fault not to read and reread and discuss Marx- which is to say also a few others- and to go beyond scholarly “reading” or “discussion.” It will be more and more a fault, a failing of theoretical, philosophical, political responsibility” (Derrida, 1994:13). Again, he writes, “the Marxist inheritance was- and still remains, and so it will remain- absolutely and thoroughly determinate. One need not be a Marxist or a communist in order to accept this obvious fact. We all live in a world, some would say a culture, that bears, at an incalculable depth, the mark of this inheritance, whether in a directly visible fashion or not” (Ibid.).

Marxism during the era of the Cold War was the major ideological issue and in the present age of neoliberalism it continues to haunt (Derrida’s precise phrase is hauntology) us with its multiple legacies. Wiredu’s critique of Marx and Engels is located within the epoch of the Cold War. But from it, we get a glimpse of not only his political orientation but also his philosophical predilections. For instance, at a point, he claims “the food one eats, the hairstyle one adopts, the amount of money one has, the power one wields- all these and such circumstances are irrelevant from an epistemological point of view” (Wiredu, 1980:66). But Foucault-style analyses have demonstrated that these seemingly marginal activities have a tremendous impact on knowledge/power configurations that are often difficult to ignore. Michel de Certeau has demonstrated that these so-called inconsequential acts become significant as gestures of resistance for the benefit of the weak and politically powerless. In his words, “the weak must continually turn to their own ends forces alien to them” (de Certeau 1984: xix). On those specific acts of the weak, he writes, “many everyday practices (talking, reading, moving about, shopping, cooking, etc.) are tactical in character. And so are, more generally, many “ways of operating”: victories of the “weak” over the “strong” (whether the strength be that of powerful people or the violence of things or of an imposed order, etc.), clever tricks, knowing how to get away with things, “hunter’s cunning,” maneuvers, polymorphic simulations, joyful discoveries, poetic as well as warlike. The Greeks called these “ways of operating” metis” (Ibid.). This reading gives an entirely different perspective on acts and themes of resistance as panoptical surveillance in the age of global neoliberalism becomes more totalitarian in nature at specific moments.

As a philosopher versed in analytic philosophy, Wiredu takes truth as a primary concern, and this concern is incorporated into his analysis of Marxist philosophy. Hence, he identifies the following points, “the cognition of truth is recognized by Engels as the business of philosophy; (2) What is denied is absolute truth, not truth as such; (3) The belief, so finely expressed, in the progressive character of truth; (4) Engels speaks of this process of cognition as the ‘development of science.’ (5) That a consciousness of limitation is a necessary element in all acquired knowledge” (Wiredu, 1980:64-65). Wiredu explains that these various Marxian assertions on truth are no different from those of the logician, C. S. Peirce, who had expounded them under a formulation he called “fallibilism.” John Dewey also expounded them under the concept of ‘pragmatism’ (Ibid.67). So the point here is that some of the main Marxist propositions on truth have parallels in analytic philosophy. Nonetheless, this raises an unsettling question about Marxism and its relation to truth: “How is it that a philosophy which advocates such an admirable doctrine as the humanistic conception of truth tends so often to lead in practice to the suppression of freedom of thought and expression? Is it by accident that this comes to be so? Or is it due to causes internal to the philosophy of Marx and Engels” (Ibid.68). Wiredu demonstrates strong reservations about what Ernest Wamba dia Wamba calls “bureaucratic socialism.” Derrida, for his part, urges us to distinguish between Marx as a philosopher and the innumerable specters of Marx. In other words, there is a difference between “the dogma machine and the “Marxist” ideological apparatuses (States, parties, cells, unions, and other places of doctrinal production)” (Derrida, 1994:13) and the necessity to treat Marx as a great philosopher. We need to “try to play Marx off against Marxism so as to neutralize, or at any rate muffle the political imperative in the untroubled exegesis of classified work” (Ibid.31). We also need to remember that “he doesn’t belong to the communists, to the Marxists, to the parties, he ought to figure within our great canon of […] political philosophy” (Ibid.31).

Wiredu’s reading of Marxism generally is quite damaging. First, he states, “Engels himself, never perfectly consistent, already compromises his conception of truth with some concessions to absolute truth in Anti-Duhring” (Wiredu, 1980:68). He then makes an even more damaging accusation that a form of authoritarianism lies at the heart of the conception of philosophy propagated by Marx and Engels. On what he considers to be a deep-seated confusion in their work, he writes, “Engels recognizes the cognition of truth to be a legitimate business of philosophy and makes a number of excellent points about truth. As soon, however, as one tries to find out what he and Marx conceived philosophy to be like, one is faced with a deep obscurity. The problem resolves round what one may describe as Marx’s conception of philosophy as ideology” (Ibid.70). Here, Wiredu makes the crucial distinction between Marx as a philosopher and the effects of his numerous spectralities, and on this basis he offers the most important criticism in his general critique of Marxism. He also accuses Marx of instances of “carelessness in the use of cardinal terms” which he says “may be symptomatic of deep inadequacies of thought” (Ibid.74). This charge, which relates to Marx’s conception of consciousness, is indeed serious since it borders on the question of conceptual clarification as advanced by the canon of analytic philosophy. Wiredu argues that Marx and Engels are unclear about their employment of the concept of ideology: “Marx and Engels are […] on the horns of a dilemma. If all philosophical thinking is ideological, then their thinking is ideological and, by their hypothesis, false” (Ibid.76). Wiredu’s insights are very important here: “He and Engels simply assumed for themselves the privilege of exempting their own philosophizing from the ideological theory of ideas” (Ibid.77). Consequently, Marx commits a grave error “in his conception of ideology and its bearing upon philosophy” (Ibid.81).

Another area in which Wiredu finds Marx and Engels wanting is moral philosophy. In other words, Marx “confused moral philosophy with moralism and assumed rather than argued a moral standpoint” (Ibid.79). Furthermore, he had precious little to say on the nature of the relationship between philosophy and morality. Engels does better on this score as there is a treatment of morality in Anti-Duhring. Nonetheless, Engels is charged with giving “no guidance on the conceptual problems that have perplexed moral philosophers” (Ibid.80). Henceforth, Wiredu becomes increasingly dismissive of Marx, Marxism and its followers. First, he writes, “the run-of the-mill Marxists, even less enamoured of philosophical accuracy than their masters, have made the ideological conception of philosophy a battle cry” (Ibid.80). And then he singles out ‘scientific socialism’ which he regards as being unclear in its elaboration and which he typifies as “an amalgam of factual and evaluative elements blended together without regard to categorical stratification” (Ibid.85). In one of his most damaging assessments of Marxism, he declares: “Ideology is the death of philosophy. To the extent to which Marxism, by its own internal incoherences, tends to be transformed into an ideology, to that extent Marxism is a science of the unscientific and a philosophy of the unphilosophic” (Ibid.87).

In sum, Wiredu’s general attitude towards Marxism is one of condemnation. However, in contemporary re-evaluations of Marxism a few discursive elements need to be clarified: the demarcation between Cold War and post-Cold War assessments of Marxism ought to be employed as an analytical yardstick, and it is also necessary to sift through the various specters and legacies of Marx as distinct from those of Marxism. This is the kind of reading that Derrida urges us to do and it is also one to which we shall now turn our attention.

Derrida states it is imperative to distinguish between the legacies of Marx and the various spectralities of Marxism. In addition to this distinction we might add another crucial one: analyses of Marxism before and after the fall of the former Soviet Union. Wiredu’s critique belongs to the period before the Soviet debacle, whilst Derrida draws some of his reflections from the period after the Soviet fall. In these two different critiques, we must always strive to separate the theoretical elements and insights that outlast short-lived discursive trends and political interests, which often tend to vitiate the more profound effects of the works of Karl Marx, from those that do not.

The debacle of the former Soviet Union and the apparent hegemony of neoliberal ideology have generated discourses associated with the “ends” of discourse. But Derrida points out that there is nothing new in the contemporary proclamations affirming the end of discourses, which are in fact anachronistic when compared to the earlier versions of the same discursive orientation that emerged in the 1950s and which in a vital sense owed a great deal to a certain spirit of Marx: “the eschatological themes of the “end of history,” of the “end of Marxism,” of the “end of philosophy,” of the “ends of man,” of the “last man” and so forth were, in the ‘50s, that is, forty years ago our daily bread. We had this bread of apocalypse in our mouths naturally, already, just as naturally as that which I nicknamed after the fact, in 1980, the “apocalyptic tone in philosophy”” (Derrida, 1994:14-15). In a way, the contemporary discourses of endism that draw from the spirit of neoliberal triumphalism are, without acknowledging it, greatly indebted to Marxism and the more constructive critiques of it. Deconstruction, in part, emerged from the necessity to critique the various forms of statist Stalinism, the numerous socio-economic failings of Soviet bureaucracy and the political repression in Hungary. In other words, it emerged partly from the need to organize critiques of degraded forms of socialism.

In speaking about the inheritance of Marx, Derrida also reflects on the injunction associated with it. The task of reflecting on this inheritance and the injunction to which it gives rise is demanding: … “one must filter, sift, criticize, one must sort out several different possibles that inhabit the same injunction. And inhabit it in a contradictory fashion around a secret. If the readability of a legacy were given, natural, transparent, univocal, if it did not call for and at the same time defy interpretation, we would never have anything to inherit from it” (Ibid.16). Derrida’s employment of terms and phrases such as “inheritance,” “injunction,” and the “spectrality of the specter” in relation to the legacies of Marx has to do with the question of the genius of Marx: “Whether evil or not, a genius operates, it always resists and defies after the fashion of a spectral thing. The animated work becomes that thing, the Thing that, like an elusive specter, engineers [s’ingenie] a habitation without proper inhabiting, call it is a haunting, of both memory and translation” (Ibid.18).

A work of genius, a masterpiece, in addition to giving rise to spectralities, also generates legions of imitators and followers. Of the Marxists who came after Marx, Wiredu writes: “I find that Marxists are especially prone to confuse factual with ideological issues. Undoubtedly, the great majority of those who call themselves Marxists do not share the ideology of Marx” (Wiredu, 1980:94). In order to transcend the violence and confusion of Marxists who misread Marx, we need “to play Marx off against Marxism so as to neutralize, or at any rate muffle the political imperative in the untroubled exegesis of a classified work” (Derrida, 1994:31). The work of re-reading Marx, of re-establishing his philosophical value and importance, is a task that needs to be performed in universities, conferences, colloquia and also in less academic sites and fora.

Within the contemporary cultural moment, new configurations have arisen that were not present during Marx’s day. Indeed, “a set of transformations of all sorts (in particular, techno-scientific-economic-media) exceeds both the traditional givens of the Marxist discourse and those of the liberal discourse opposed to it”(Ibid.70). Also,

Electoral representativity or parliamentary life is not only distorted, as was always the case, by a great number of socio-economic mechanisms, but it is exercised with more and more difficulty in a public space profoundly upset by techno-tele-media apparatuses and by new rhythms of information and communication, by the devices and the speed of forces represented by the latter, but also and consequently by the new modes of appropriation they put to work, by the new structure of the event and of its spectrality that they produce. (Ibid.79)

Here, the instructive point is that the new information technologies have radically transformed the possibilities of the event and the modes of its production, reception and also interpretation. But there is a far more radical change that has occurred and which signals a profound crisis of global capitalism and the neoliberal ideology that underpins it: “For what must be cried out, at a time when some have the audacity to neo-evangelize in the name of the ideal of liberal democracy that has finally realized itself as the ideal of human history: never have violence, inequality, exclusion, famine, and thus economic oppression affected as many human beings in the history of the earth and of humanity”(Ibid.85). Also, “never have so many men, women, and children been subjugated, starved, or exterminated on the earth.” (Ibid.)

So Derrida identifies a few new factors that need to be included in the critique of Marxism in the contemporary moment, namely the phenomenon of spectralization caused by techno-science and digitalization, the weakening of the practice of liberal democracy and also the crises and multiple contradictions inherent in global capitalism. It is necessary to include another element in the present configuration, which is the rise of political Islam as an alternative ideology, its subsequent fervent politicization and its Western reconstruction into an ideology of terror.

Wiredu’s reading of Marx focuses on the conceptual infelicities in the latter’s theorizations of notions such as “ideology,” “consciousness,” and “truth.” Wiredu also criticizes Marx’s project of moral philosophy or in fact the lack of it. On the whole, his reading isn’t complimentary. Indeed, it amounts to a dismissal of Marx in spite of the attempt to read him without the obfuscations of innumerable legacies.

6. Conclusion

Arguably, Wiredu’s particular contribution to the debate on the origins, status, problematic and future of contemporary African philosophy resides in his formulations regarding his theory of conceptual decolonization. His approach in formulating this theory of discursive agency and more specifically philosophical practice involves the incorporation of a form of bi-culturalism. In other words, his approach entails analyses of the canon of Western philosophy and also the manifestations of tribal cultures as a way of attaining a conceptual synthesis. Indeed, this schema involves a forceful element of bi-culturalism as a matter of logical consequence as well as a high level of [multi] bi-lingual competence. As such, it is not only an exercise in conceptual synthesis but also a project involving comparative linguistics.

In Anglophone parts of Africa, Wiredu’s experience and research in teaching African philosophy have had a tremendous significance. The positive aspect of this is that the study of African philosophical thought has in positive moments transcended the problematic of identity or what has been termed the problematic of origins. The less complimentary dimension of this equation is that Wiredu’s discoveries have given rise (most probably unwittingly) to a somewhat hegemonic school of disciples that is fostering a delimiting academicism and which is contrary to his essential spirit of conceptual inventiveness. As such, it might become necessary not only to critique Wiredu’s corpus but perhaps also Wiredu’s school of disciples, which rather than appreciating the originality of his formulations falls instead for the pitfalls of over-ideologization.

Undoubtedly, Wiredu discovered a challenging path in modern African thought in which he sometimes takes the meaning of the existence of African philosophy for granted. In addition, it has been observed that also lacking at some moments in his oeuvre is an attempt to de-totalize and hence particularize the components of what he regards as the foundations of African philosophy. In other words, African philosophy finds its form, shape and also its conceptual moorings above the discursive platform provided by Western philosophy. In addition, the theoretical space made available for its articulation is derived from the same Western-donated pool of unanimism. Part of recent interrogations of Wiredu’s work includes a questioning of the legitimacy of that space as the only site on which to construct an entire philosophical practice for the alienated, hybrid African consciousness. Oftentimes the question is posed, what are the ways by which the space can be broadened?

Indeed, terms such as reflective integration and due reflection offer the critical spaces for the theoretical articulation of something whose existence has not yet been concretely conceived. So in Wiredu’s corpus we see the very familiar problematic involving the tradition/modernity dichotomy being played out. Finally, it can be argued that this tension is not quite resolved but fortunately it is also a tension that never jeopardizes his philosophical inventiveness. Rather, it seems to animate his reflections in unprecedented ways.

7. References and Further Reading

Cronon, D. E. 1955. Black Moses: The Story of Marcus Garvey and the Universal Negro Improvement Association, Wisconsin: University of Wisconsin Press.
Cummings, Robert. 1986. “Africa between the Ages” in African Studies Review, Vol. 29, No. 3, September.
Diop, Cheikh, Anta, 1974. The African Origin of Civilization: Myth or Reality? Trans. M. Cook, Westport, Conn.: Lawrence Hill.
Doortmont, Michel R. 2005 The Pen-Pictures of Modern Africans and African Celebrities by Charles Francis Hutchison, Leiden and Boston: Brill.
Dubow, Saul. 2000 The African National Congress, Johannesburg: Jonathan Ball.
Derrida, Jacques. 1994. Specters of Marx: the state of the debt, the work of mourning, & the new international, trans. Peggy Kamuf, New York: Routledge.
Gates Jr., H. L. 1992. Loose Canons, New York: Oxford University Press.
Fanon, Frantz. 1967 Black Skin, White Masks (trans. C. Van Markmann) New York: Grove Press.
Fanon, Frantz. 1963 The Wretched of the Earth, London: Penguin.
Foucault, Michel. 1974 The Order of Things: An Archaeology of the Human Sciences. New York: Pantheon.
Foucault, Michel. 1977 Discipline and Punish: The Birth of the Prison. Trans A. M. Sheridan-Smith. London: Allen Lane.
Foucault, Michel. 1980 Language, Counter-Memory and Practice. Selected Essays and Interviews. Ed. Donald Bouchard, Ithaca, NY: Cornell University Press.
Foucault, Michel. 1982 The Archaeology of Knowledge. New York: Pantheon.
Foucault, Michel. 1991 “Governmentality” in G. Burchell, C. Gordon and P. Miller, eds, The Foucault Effect. Chicago: Chicago University Press.
Hountondji, Paulin. 1983 African Philosophy: Myth and Reality, London: Hutchinson and Co.
Hountondji, Paulin. 2002 The Struggle for Meaning: Reflections on Philosophy, Culture and Democracy in Africa, Athens: Ohio University Center for International Studies.
Masolo, D.A. 1994 African Philosophy in Search of Identity Bloomington and Indianapolis: Indiana University Press.
Mudimbe V.Y. 1988 The Invention of Africa Bloomington and Indianapolis: Indiana University Press.
Mudimbe V.Y. 1994. The Idea of Africa, Bloomington and Indianapolis: Indiana University Press.
Oladipo, Olusegun. ed. 2002 The Third Way in African Philosophy: Essays in Honour of Kwasi Wiredu Ibadan: Hope Publications Ltd.
Osha, Sanya, 2005 Kwasi Wiredu and Beyond: The Text, Writing and Thought in Africa, Dakar: Codesria.
Soyinka, Wole, 1976 Myth, Literature and the African World Cambridge: Cambridge University Press.
Soyinka, Wole, 1988 Art, Dialogue and Outrage Ibadan: New Horn Press.
Soyinka, Wole, 1996 The Open Sore of a Continent New York: Oxford University Press.
Soyinka, Wole. 1999 The Burden of Memory, The Muse of Forgiveness New York: Oxford University Press.
Soyinka, Wole. 2000 “Memory, Truth and Healing” in The Politics of Memory, Truth, Healing and Social Justice, eds. I. Amaduime and A. An-Na’im, London: Zed Books
Wa Thiongo, Ngugi. 1972 Homecoming London, Ibadan, Lusaka: Heinemann.
Wa Thiongo, Ngugi. 1981 Writers in Politics Nairobi: Heinemann.
Wa Thiongo, Ngugi. 1986 Decolonising the Mind Nairobi: E.A.E.P.
Wa Thiongo, Ngugi. 1993 Moving the Centre London: James Currey.
Wiredu, Kwasi. 1980 Philosophy and an African Culture Cambridge: Cambridge University Press.
Wiredu, Kwasi. 1983 “The Akan Concept of Mind” in Ibadan Journal of Humanistic Studies, No. 3.
Wiredu, Kwasi. 1985 “The Concept of Truth in Akan Language” in P.O. Bodunrin ed. Philosophy in Africa: Trends and Perspectives, Ile-Ife: University of Ife Press.
Wiredu, Kwasi. and Gyekye, Kwame. 1992 Persons and Community. Washington, D.C.: The Council for Research in Values and Philosophy.
Wiredu, Kwasi. 1993 “Canons of Conceptualisation” in The Monist: An International Journal of General Philosophical Inquiry Vol. 76, No. 4 October.
Wiredu, Kwasi. 1995 Conceptual Decolonization in African Philosophy Ibadan: Hope Publications.
Wiredu, Kwasi. 1996 Cultural Universals and Particulars Bloomington and Indianapolis: Indiana University Press.
Yai, Olabiyi. 1977 “The Theory and Practice in African Philosophy: The Poverty of Speculative Philosophy,” Second Order: An African Journal of Philosophy, Vol.VI, No.2.

Author Information

Sanya Osha
Email: babaosha@yahoo.com
Tshwane University of Technology
South Africa

The Problem of the Criterion

The Problem of the Criterion is considered by many to be a fundamental problem of epistemology. In fact, Chisholm (1973, 1) claims that the Problem of the Criterion is “one of the most important and one of the most difficult of all the problems of philosophy.” A popular form of the Problem of the Criterion can be raised by asking two seemingly innocent questions: What do we know? How are we to decide in any particular case whether we have knowledge? One quickly realizes how troubling the Problem of the Criterion is because it seems that before we can answer the first question we must already have an answer to the second question, but it also seems that before we can answer the second question we must already have an answer to the first question. That is, it seems that before we can determine what we know we must first have a method or criterion for distinguishing cases of knowledge from cases that are not knowledge. Yet, it seems that before we can determine the appropriate criterion of knowledge we must first know which particular instances are in fact knowledge. So, we seem to be stuck going around a circle without any way of getting our epistemological theorizing started. Although there are various ways of responding to the Problem of the Criterion, the problem is difficult precisely because it seems that each response comes at a cost. This article examines the nature of the Problem and the costs associated with the most promising responses to the Problem.

The Problem
Chisholm on the Problem of the Criterion
Other Responses to the Problem of the Criterion
1. Explanationist Responses
2. Dissolution
The Problem of the Criterion’s Relation to Other Philosophical Problems
References and Further Reading

1. The Problem

The Problem of the Criterion is the ancient problem of the “wheel” or the “diallelus”. It comes to us from Book 2 of Sextus Empiricus’ Outlines of Pyrrhonism. Sextus presents the Problem of the Criterion as a major issue in the debate between the Academic Skeptics and the Stoics. After Sextus’ presentation though, philosophers largely seemed to lose interest in the Problem of the Criterion until the modern period. The problem resurfaced in the late 1500s with Michel de Montaigne’s “Apology for Raymond Sebond” and it again had a significant influence. Following the modern period, however, the Problem of the Criterion largely disappeared until the early 19th century when G.W.F. Hegel (1807) presented the problem and, arguably, put forward one of the first coherentist responses to the Problem of the Criterion (Rockmore (2006) and Aikin (2010)). In the late 19th and early 20th centuries Cardinal D.J. Mercier (1884) and his student P. Coffey (1917) again reminded the world of the problem. In the late 20th century the Problem of the Criterion played an important role in the work of two philosophers: Roderick Chisholm and Nicholas Rescher. In fact, it is primarily due to the work of Roderick Chisholm that the Problem of the Criterion is discussed by contemporary epistemologists at all. (See Amico (1993) and Popkin (2003) for further discussion of the historical development of the Problem of the Criterion).

In light of Chisholm’s enormous influence on contemporary discussions of the Problem of the Criterion his presentation of the problem is a fitting place to begin getting clear on things. Chisholm (1973, 12) often introduces the Problem of the Criterion with the following pairs of questions:

(A) What do we know? What is the extent of our knowledge?

(B) How are we to decide whether we know? What are the criteria of knowledge?

However, Chisholm also speaks approvingly of Montaigne’s presentation of the Problem of the Criterion, which is in terms of true/false appearances rather than knowledge. Further, there is some ambiguity in Chisholm’s own discussions of the Problem of the Criterion as to whether the problem presented by the Problem of the Criterion is the meta-epistemological problem of determining when we have knowledge or the epistemological problem of determining what is true. So, there is a difficulty in determining exactly what problem the Problem of the Criterion is supposed to pose.

The fact that Chisholm’s discussion oscillates between these two versions of the Problem of the Criterion and the fact that he seems to be aware of the two versions of the problem help make it clear that perhaps there is no such thing as the Problem of the Criterion. Perhaps the Problem of the Criterion is rather a set of related problems. This is something that many philosophers since Chisholm, and Chisholm himself (see his 1977), have noted. For instance, Robert Amico (1993) argues that Chisholm mistakenly takes himself to be discussing the same problem as Sextus Empiricus when he considers the Problem of the Criterion. Richard Fumerton (2008) points out that there are at least two versions of the Problem of the Criterion. The first is a methodological problem of trying to identify sources of knowledge or justified belief (this, he claims, is the version of the problem that Chisholm focuses on). The second is the problem of trying to identify the necessary and sufficient conditions for correctly applying concepts such as ‘knowledge’ or ‘justification’. Michael DePaul (1988, 70) expresses a version of the Problem of the Criterion limited to moral discourse in terms of two questions: “Which of our actions are morally right?” and “What are the criteria of right action?”

Since there are many versions of the Problem of the Criterion, one might worry that it will be nearly impossible to formulate the Problem of the Criterion precisely. Fortunately, this is not the case. Although there are many particular instances of the Problem of the Criterion, they all seem to be questions of epistemic priority. In other words, the various versions of the Problem of the Criterion are focused on trying to answer the question “how is it possible to theorize in epistemology without taking anything epistemic for granted?” (Conee 2004, 17). More generally: how is it possible to theorize at all without making arbitrary assumptions? Hence, perhaps the best way to formulate the Problem of the Criterion in its most general form is with the following pair of questions (Cling (1994) and McCain and Rowley (2014)):

(1) Which propositions are true?

(2) How can we tell which propositions are true?

Plausibly, all the various formulations of particular versions of the Problem of the Criterion can be understood as instances of the problem one faces when trying to answer these general questions.

Before moving on it is important to be clear about the nature of (1) and (2). These are not questions about the nature of truth itself. Rather, these are epistemological questions concerning which propositions we should think are true and what the correct criteria are for determining whether a proposition should be accepted as true or false. It is possible that one could have answers to these questions without possessing any particular theory of truth, or even taking a stand at all as to the correct theory of truth. Additionally, it is possible to have a well-developed theory of the nature of truth without having an answer to either (1) or (2). So, the issue at the heart of the Problem of the Criterion is how to start our epistemological theorizing in the correct way, not how to discover a theory of the nature of truth.

Most would admit that it is important to start our epistemological theorizing in an appropriate way by not taking anything epistemic for granted, if possible. However, this desire to start theorizing in the right way coupled with the questions of the Problem of the Criterion does not yield a problem—it is merely a desire we have and questions we need to answer. The problem yielded by the Problem of the Criterion arises because one might plausibly think that we cannot answer (1) until we have an answer to (2), but we cannot answer (2) until we have an answer to (1). So, at least initially, consideration of the Problem of the Criterion makes it seem that we cannot get our theorizing started at all. This seems to land us in a pretty extreme form of skepticism—we cannot even begin the project of trying to determine which propositions to accept as true.

Of course, there are anti-skeptical ways to respond to the Problem of the Criterion. According to Chisholm, these anti-skeptical responses are question-begging. In light of this one might think that extreme skepticism is inevitable. However, this might not be correct. The extreme skepticism threatened by the Problem of the Criterion itself seems guilty of begging the question. This is why Chisholm (1973, 37) claims “we can deal with the problem only by begging the question.”

2. Chisholm on the Problem of the Criterion

According to Chisholm, there are only three responses to the Problem of the Criterion: particularism, methodism, and skepticism. The particularist assumes an answer to (1) and then uses that to answer (2), whereas the methodist assumes an answer to (2) and then uses that to answer (1). The skeptic claims that you cannot answer (1) without first having an answer to (2) and you cannot answer (2) without first having an answer to (1), and so you cannot answer either. Chisholm claims that, unfortunately, regardless of which of these responses to the Problem of the Criterion we adopt we are forced to beg the question. It will be worth examining each of the responses to the Problem of the Criterion that Chisholm considers and how each begs the question against the others.

The particularist assumes an answer to (1) that does not epistemically depend on an answer to (2) and uses her answer to (1) to answer (2). More precisely, the particularist response to the Problem of the Criterion is:

Particularism: Assume an answer to (1) (accept some set of propositions as true) that does not depend on an answer to (2) and use the answer to (1) to answer (2).

What is the epistemic status of the particularist’s answer to (1)? Chisholm (1973, 37) seems to take it that its status is weak, being nothing more than an assumption:

But in all of this I have presupposed the approach I have called “particularism.” The “methodist” and the “skeptic” will tell us that we have started in the wrong place. If now we try to reason with them, then, I am afraid, we will be back on the wheel.

One might think that the question-begging only occurs if the particularist tries to reason with her methodist or skeptical interlocutors. So, one might think the problem for particularism is simply a lack of reasons in support of particularism that advocates of methodism or skepticism would accept.

However, things are worse than this. The real problem with particularism is not simply the dialectical problem of providing grounds that methodists and skeptics will accept; rather it is an epistemic problem. The problem with particularism is that the particularist’s starting point is an unfounded assumption. Particularism starts with a set of particular propositions and works from there. If the particularist goes beyond that set of particular propositions to provide reasons for accepting them, she abandons that particularist response and either picks a new set of particular propositions to assume (a new particularist response) or picks something other than simply a new set of only particular propositions to assume and ceases to be a particularist. So, the problem for the particularist response is much deeper than a dialectical problem that arises only when trying to deal with opposing views. The particularist cannot offer reasons for particularism beyond the unfounded assumption of a set of particular propositions. By simply assuming an answer to (1), the particularist begs the question against both the methodists and the skeptics.

Particularism is not unique in begging the question though. It seems that methodism begs the question too. The methodist response to the Problem of the Criterion is:

Methodism: Assume an answer to (2) (accept some criterion to be a correct criterion of truth, one that successfully discriminates true propositions from false ones) that does not depend on an answer to (1) and use the answer to (2) to answer (1).

Since methodism begins by assuming that some criterion is a correct criterion of truth without providing any epistemic reason to prefer this response to the alternatives, it begs the question against particularism and skepticism.

The skeptical response to the Problem of the Criterion assumes that both particularism and methodism are mistaken. That is, the skeptical response to the Problem of the Criterion assumes that there is no answer to (1) that does not depend on an answer to (2) and there is no answer to (2) that does not depend on an answer to (1). As Chisholm (1973, 14) explains the response:

And so we can formulate the position of the skeptic on these matters. He will say: ‘You cannot answer question 1 until you answer question 2. And you cannot answer question 2 until you answer question 1. Therefore, you cannot answer either question. You cannot know what, if anything, you know, and there is no possible way for you to decide in any particular case.’

(The names of the questions have been changed from Chisholm’s “A” and “B” to “1” and “2”, respectively, in this quotation in order to maintain continuity with the present discussion.)

A bit more succinctly:

Skepticism: Assume that (i) there is no independent answer to (1) or (2), and (ii) if (1) and (2) cannot be answered independently, they cannot be answered at all.

According to Chisholm, the skeptical response has no more to recommend it than particularism or methodism. The reason for this is that skepticism, as a response to the Problem of the Criterion, is question-begging. The skeptic simply assumes that there is no independent answer to (1) or (2), and though both the particularist and the methodist deny this assumption, they can only respond by appealing to assumptions of their own. The skeptic has no reasons to support the assumption that there is no independent answer to (1) or (2). The conflict between the three responses that Chisholm considers comes down to ungrounded assumptions. It is because of this fact that Chisholm claims that when facing the Problem of the Criterion we have no choice but to beg the question. Since all responses beg the question, skepticism is no better off than any other response to the Problem of the Criterion.

At this point it is worth getting clear on two further points about the skeptical response. First, it should be noted that the skeptical response is not the only response that might lead to a thoroughgoing skepticism. For instance, one might be a methodist who assumes the criterion for distinguishing true from false propositions is absolute certainty. That is, a methodist might think that the only way to tell whether a proposition is true is for the truth of the proposition to be absolutely certain for her. Pretty clearly this sort of methodism will lead to a fairly extreme skepticism. One of the lessons of Cartesian skepticism is that it is implausible to think that we can be absolutely certain about the truth of any proposition about the external world.

Second, one might think that the skeptical response to the Problem of the Criterion really is better off than particularism or methodism. One might think that the skeptical response simply emerges from consideration of the problems facing both particularism and methodism, and so does not have to make any assumptions of its own.

Although the skeptical response may arise in this way, it does not absolve skepticism of begging the question. As Chisholm notes, the skeptical response has nothing in itself that makes it better than particularism or methodism. The skeptical resolution of the Problem of the Criterion has nothing to appeal to other than unfounded assumptions in order to motivate it over its alternatives. Without something more than unfounded assumptions there does not seem to be any reason to prefer the skeptic’s response. Given this, accepting the skeptical response would still beg the question because there is no more reason to accept it than there is to accept any of the other positions. Since the skeptical response has nothing more to recommend it in itself than the other responses, there is no more reason to accept the skeptical response because of the problems for particularism and methodism than there is to accept particularism because of the problems with the other responses, or to accept methodism for the same reason. All three options seem to be on equal footing when it comes to having reason to pick them over their rivals and they all beg the question.

Each of these responses to the Problem of the Criterion begins with an unfounded assumption, one that is unsupported by reasons, and so begs the question in an epistemic sense against the other two. Despite this and his emphasis on the fact that all three responses are unappealing because of their question-begging, Chisholm famously argues in support of particularism. His argument in support of particularism, which he sometimes refers to as “commonsensism”, involves criticizing the other two responses and giving some reasons for preferring particularism.

Concerning methodism, Chisholm offers two objections. First, he objects that the criterion that methodism starts with will be “very broad and far-reaching and at the same time completely arbitrary” (1973, 17). Essentially, he thinks that there can be no good reason for starting with a broad criterion. Second, he objects that methodism (at least of the empiricist variety that he considers in detail) will lead to skepticism. When we adopt the methodist’s broad criterion it will turn out that many of the things we commonsensically take ourselves to know do not count as knowledge. Chisholm finds this unacceptable.

Chisholm’s case against the skeptical response to the Problem of the Criterion seems also to come down to two things. The first is quite plain. If methodism is flawed because it will lead to skepticism concerning many areas where we take ourselves to have knowledge, it is no surprise that Chisholm finds the skeptical response to the Problem of the Criterion to be unacceptable. It too has this result. In fact, the skeptical response is in a sense doubly skeptical. It not only holds that we lack knowledge in areas where we typically take ourselves to have knowledge, it also holds that we cannot even begin the process of determining what we do know. The second problem Chisholm seems to have with skepticism is simply that it has no more to recommend it than either of the other views. Admittedly, this does not seem to be much of a criticism, especially since he grants that all three responses make unfounded assumptions.

Unfortunately, Chisholm’s positive support for particularism is very sparse. In fact in his Aquinas Lectures he only claims, “in favor of our approach [particularism] there is the fact that we do know many things, after all” (1973, 38). But, of course, this seems to merely be a statement of the assumption made by particularism, not a defense of it. As a solution to the Problem of the Criterion, Chisholm’s particularism seems to be lacking. In fact, Robert Amico (1988b) argues that Chisholm’s “solution” is clearly unacceptable because Chisholm does not give us good independent reasons to reject either methodism or skepticism, he does not provide good reasons to prefer particularism to the other responses, and, as Chisholm himself admits, particularism begs the question.

Given the very weak argument in support of his preferred view, one might wonder what Chisholm is really up to when he discusses the Problem of the Criterion. Throughout the many works in which he discusses the Problem of the Criterion, Chisholm consistently favors particularism, but he also makes it clear that all responses to the Problem of the Criterion are unappealing and that his own view must, just like its rivals, beg the question. In response to Amico’s criticisms, Chisholm claims that particularism is superior to methodism and skepticism because by being a particularist one can give a reasonable account of knowledge, whereas one cannot make progress in epistemology by taking a methodist or skeptical approach.

A few further points about Chisholm’s take on the Problem of the Criterion that are often overlooked are worth mentioning here. First, he claims that we should remain open to the possibility of one day discovering a version of methodism that fares better than the empiricist version he criticizes. Second, Chisholm is adamant that in supporting particularism he is not trying to solve the Problem of the Criterion because “the problem of the criterion has no solution” (1988, 234). So, Chisholm thinks that particularism is simply the best of a set of bad options—the options are bad because they beg the question; particularism is best because it allows us to make progress in epistemology.

3. Other Responses to the Problem of the Criterion

Chisholm claimed that there are only three responses to the Problem of the Criterion and that there is no solution to this problem. Many philosophers disagree with Chisholm on both points. In fact, Andrew Cling (1994) argues that there are eight non-skeptical responses to the Problem of the Criterion. Importantly, Cling does not consider two of the non-skeptical responses that we will consider below, so it seems that if Cling is correct concerning the eight non-skeptical approaches he mentions and the two additional approaches discussed below are distinct responses, there are at least eleven (ten non-skeptical and one skeptical) responses to the Problem of the Criterion. While there are many possible responses to the Problem of the Criterion the focus here will be limited to those that have been defended in the literature.

a. Explanationist Responses

As noted above, there are a number of responses to the Problem of the Criterion beyond the three kinds that Chisholm considers. The employment of explanatory reasoning offers promising alternatives to the responses Chisholm considers. These explanationist responses share a commitment to explanatory reasoning—they all involve attempting to answer (1) and (2) in a way that yields the most satisfactory explanatory picture. A helpful way of understanding explanationist responses is as employing the method of reflective equilibrium to respond to the Problem of the Criterion. Roughly, the method of reflective equilibrium involves starting with a set of data (beliefs, intuitions, etc.) and making revisions to that set—giving up some of the data, adding new data to the set, giving more/less weight to some of the data, and so on—so as to create the best explanatory picture overall. Reaching this equilibrium state of maximized explanatory coherence of the remaining data is thought to make accepting whatever data remains, whether this includes any of one’s initial data or not, reasonable (see coherentism and John Rawls for more on reflective equilibrium). Of course, there have been criticisms of the viability of reflective equilibrium as a method of reasoning; however, for current purposes these can be set aside because the ultimate concern here is simply the sort of responses that can be generated by employing reflective equilibrium.

There are a variety of ways that one might attempt to respond to the Problem of the Criterion by using the method of reflective equilibrium. The variation in these responses is largely a result of what one includes in the set of data that will form the basis for one’s reflection. It is worth considering some of the more promising varieties of this response that have been put forward in the literature.

i. Explanatory Particularism

Although the explanatory particularism defended by Paul Moser (1989) is a kind of particularism, its explanationist elements warrant discussing it as a separate variety of response. Moser’s (1989, 261) explanatory particularism begins with one’s “considered, but revisable, judgments” concerning particular propositions. This is importantly different from the sort of particularism that Chisholm describes because explanatory particularism allows that the beliefs about the truth of particular propositions are revisable whereas particularism as Chisholm describes it does not clearly allow for this. It is because of this that Moser claims that explanatory particularism does not beg the question against skeptics by ruling out skepticism from the start. Importantly, the kind of skepticism Moser is discussing here is not the skeptical response to the Problem of the Criterion, but rather the sort of skepticism that grants that we can get started in epistemological theorizing while claiming that ultimately we will end up lacking knowledge in a wide range of cases. External world skepticism is an example of this sort of skepticism; it grants that we are aware of what is required for knowledge, but claims that we simply fail to have knowledge of the world around us. Like Chisholm’s particularism, explanatory particularism uses this initial set of propositions (i.e. this answer to (1)) to develop epistemic principles or criteria for truth (i.e. to answer (2)). The initial set of propositions and criteria are both continually revised until a state of maximal explanatory coherence is reached.

Moser claims that explanatory particularism avoids begging the question in the way that Chisholm’s particularism or methodism does. The reason for this is that Moser claims that the beliefs that explanatory particularism starts with are revisable. Despite this and Moser’s claim that explanatory particularism does not beg the question against the skeptic, it is not clear that it avoids begging the question against the skeptical response to the Problem of the Criterion. After all, explanatory particularism assumes an independent answer to (1)—revisable or not it is still an answer—and then uses that to answer (2). So, it at least seems that explanatory particularism begs the question against the skeptical response by denying the skeptic’s assumption that there is no independent answer to (1) or (2).

Ernest Sosa (2009) also defends a view that we might call explanatory particularism. On Sosa’s view we begin with particular items of knowledge. That is, we start with particular propositions that we know to be true. According to Sosa, we know these propositions because our beliefs with respect to these propositions satisfy a correct general criterion of knowledge (they are formed by sufficiently reliable cognitive faculties). Although we have knowledge of these propositions, we merely have what he terms “animal knowledge”. Our knowledge of these propositions when we begin is only animal knowledge because we lack a higher-order perspective on these beliefs. That is to say, we lack “reflective knowledge” of the fact that these first-order beliefs satisfy the proper criterion of knowledge. However, on Sosa’s view we use our animal knowledge to develop a perspective on our epistemic situation that offers us an explanatory picture about how or why our first-order beliefs really do constitute knowledge, i.e. we develop reflective knowledge as to how our particular pieces of animal knowledge satisfy the proper criterion for knowledge. This explanatory perspective yields reflective knowledge and it strengthens our animal knowledge. A significant component of this picture is that we use our starting animal knowledge to come to answer both (1) and (2) from a reflective standpoint. So, we begin with an answer to (1) in terms of animal knowledge and we use that answer to develop a perspective that gives us an answer to (2) with respect to both animal and reflective knowledge and an answer to (1) in terms of reflective knowledge. Sosa’s explanatory response to the Problem of the Criterion relies on a mixture of levels.

Although Sosa’s explanatory particularism with its multiple levels seems more complex than Moser’s, one might think that it begs the question in the same way that Moser’s response does. Namely, Sosa’s response, like Moser’s, assumes an independent answer to (1). Sosa’s assumed answer is only in terms of animal knowledge; however, it is still an answer. His response then requires using that answer to develop an explanatory perspective that provides an answer to (2). Thus, one might think that Sosa’s response seems to beg the question against the skeptical response in the same way that Moser’s response seems to: by denying the skeptic’s assumption (i) that there is no independent answer to (1) or (2).

ii. Coherentism

Coherentism responds to the Problem of the Criterion by starting with both beliefs about which propositions are true and beliefs about the correct method or methods for telling which beliefs are true. It then uses these beliefs to attempt to answer both (1) and (2) at the same time (DePaul 1988 & 2009, Cling 1994, and Poston 2011). As Andrew Cling (1994, 274) explains:

To be a coherentist is to reject the epistemic priority of beliefs and criteria of truth. Instead, coherentists recommend balancing beliefs against criteria and criteria against beliefs until they all form a consistent, mutually supporting system.

The coherentist does not simply assume that the criterion of truth is to balance “beliefs against criteria and criteria against beliefs.” To understand coherentism in this way would simply make it a variety of methodism, and so fail to appreciate the importance of its employment of reflective equilibrium. Instead, the coherentist response involves starting with both beliefs about criteria of truth and also beliefs that particular propositions are true and then makes adjustments to beliefs of either kind in an attempt to reach a state of reflective equilibrium. Once this equilibrium state has been reached the coherentist uses it to complete her answers to (1) and (2).

On one understanding of coherentism (Cling’s 1994 and Poston’s 2011) the coherentist accepts one of the skeptic’s assumptions, but denies the other. In particular, this version of coherentism shares the skeptic’s assumption of (i) (there is no independent answer to (1) or (2)), but denies (ii) (if (1) and (2) cannot be answered independently, they cannot be answered at all). More precisely, on this way of understanding coherentism it involves accepting (i) of Skepticism and adding to it the further assumptions that: (a) a particular criterion is correct (namely, explanatory goodness), (b) a set of particular propositions are true, and (c) the criterion and the set of propositions are not independent of each other. However, it seems that if one begins with beliefs about which propositions are true and beliefs about the correct criteria for telling which beliefs are true along with the assumption that there is no independent answer to (1) or (2), this version of coherentism will beg the question for reasons similar to why Skepticism begs the question. That is to say, the coherentist’s assumption of (i) begs the question against particularism and methodism. After all, (i) is a groundless assumption with which the coherentist starts. It may be awareness of this feature that helped lead Cling (2009) to ultimately abandon his coherentist response in favor of a skeptical stance with respect to the Problem of the Criterion.

Another way of understanding this approach is as Michael DePaul (1988 & 2009) depicts it. According to this way of understanding coherentism, the coherentist starts with beliefs about which particular propositions are true and about the correct criteria for telling which beliefs are true, but she does not assume (i). This version of coherentism seems to avoid begging the question against both particularists and methodists because it does not assume that we can answer (1) prior to (2) or that we can answer (2) prior to (1) nor does it assume that they cannot be answered independently. Instead, this kind of coherentism merely applies reflective equilibrium to the coherentist’s starting set of beliefs without taking a stand on (i) at all. Now it might turn out that after the application of reflective equilibrium the coherentist is committed to a particular position with respect to (i), but this kind of coherentism does not have to take a stand on (i) from the start. So, in some respects this way of understanding coherentism may seem superior to the previous version of coherentism. However, its use of beliefs in the relevant data set seems to beg the question against the skeptic because starting with beliefs about which propositions are true assumes that we can answer and in fact already have an answer to (1). Likewise, a belief about which criteria are successful for telling which beliefs are true assumes that we can answer and have an answer to (2). In other words, this version of coherentism seems to beg the question against skepticism by assuming that (ii) is false. Thus, applying reflective equilibrium to a set of beliefs appears to beg the question by assuming that one of the assumptions of skepticism is false from the outset. This may be why DePaul (2009) accepts Chisholm’s position that all responses to the Problem of the Criterion end up begging the question.

A final coherentist response is Nicholas Rescher’s “systems-theoretic approach”. Rescher’s development of this approach takes place over several books (1973a, 1973b, 1977, and 1980). Although Rescher’s systems-theoretic approach is complex, the relevant details for the present discussion of the Problem of the Criterion are relatively straightforward. Rescher’s response begins by appealing to pragmatic considerations. It starts with a method and a goal, applies the method, and checks to see whether the results satisfy the goal. So, with respect to the Problem of the Criterion the idea is that our goal is to come to believe true propositions and we start with some criterion for distinguishing true propositions from false ones. We apply our criterion and then see if it helps us achieve our goal. Assuming that the criterion does help us achieve our goal, we have completed the first step in Rescher’s process. The second step in this process involves showing that a pragmatically successful criterion/method is connected to the truth. Here Rescher (1977, 107) claims that “only when all the pieces fit together” do we have justification for the criterion. Further, he is clear that coherence is central to this process. It is because of this that Robert Amico (1993) argues that Rescher’s view, though complex, is simply a coherentist version of methodism: Rescher ultimately assumes that coherence is the appropriate criterion of truth. This is so even though the criterion/method that Rescher starts with may not be coherence, because ultimately his way of establishing that any criterion that one starts with is actually a correct criterion is by appeal to coherence. Since Rescher assumes this role for coherence from the outset, his approach seems to be a form of methodism.

Although Rescher’s approach is a kind of methodism with a significant explanatory element, and one that may make more progress in epistemology than the sort that Chisholm criticizes, it seems to be vulnerable to the same charge of question-begging that Chisholm leveled at other forms of methodism. This is something Rescher may accept, since he does not believe that the Problem of the Criterion can be solved; it is at best something that one can “meet and overcome” (1980, 13).

iii. Applied Evidentialism

A final explanationist response to the Problem of the Criterion is what Earl Conee (2004) calls “Applied Evidentialism” (McCain and Rowley (2014) call it the “Seeming Intuition Response”). This explanationist response differs from the previous ways of using reflective equilibrium to respond to the Problem of the Criterion in that it does not start with a set of beliefs. Rather, Applied Evidentialism begins with one’s evidence. In particular, when Conee defends this view he suggests beginning with the set of intuitions or what seems true to us about various propositions. That is to say, Applied Evidentialism begins with what seems true to us both with respect to propositions about particular items of fact and with respect to criteria for determining when propositions are true. According to Applied Evidentialism, the way to respond to the Problem of the Criterion is to start with these intuitions and then make modifications (give up some intuitions, form different intuitions, rank some intuitions as more or less important than others, and so on) until a state of equilibrium has been reached. Once such an equilibrium state has been reached the data from that state can be used to answer (1) and (2).

Like the other ways of using reflective equilibrium to respond to the Problem of the Criterion, Applied Evidentialism does not seem to beg the question against particularism or methodism because it does not assume that there can be no independent answer to (1) or (2). Additionally, Applied Evidentialism does not seem to beg the question against the skeptic because it refrains from assuming an answer to (1) or (2) at the outset. Further, Applied Evidentialism does not assume from the start that the equilibrium state that we end up with will be anti-skeptical. It is consistent with Applied Evidentialism that reflection on our initial intuitions will in the end lead us to the conclusion that we are unaware of which propositions are true or that we lack an appropriate criterion for discovering this information. In other words, Applied Evidentialism does not assume that we will have an answer to (1) or (2) when we reach our end equilibrium state. After all, it could be that our equilibrium state is one in which no method appears to be correct and our best position with respect to each proposition seems to be to suspend judgment concerning its truth. So, Applied Evidentialism does not seem to beg any questions against the skeptical response to the Problem of the Criterion or other kinds of skepticism, such as Cartesian skepticism.

One might worry that Applied Evidentialism is really a form of methodism, and hence, open to the same charge of question begging as other kinds of methodism. After all, Applied Evidentialism suggests that using the method of reflective equilibrium on one’s intuitions can provide a response to the Problem of the Criterion.

Upon reflection, however, it seems that Applied Evidentialism is not a kind of methodism. Plausibly, someone can employ a method without having any beliefs about, or even conscious awareness of, the method at all. Kevin McCain and William Rowley (2014) argue that methods are analogous to rules in this sense. They maintain that someone might behave in accordance with a rule without intending to obey the rule or even being aware that there is such a rule at all. For example, one can act in accordance with a rule of not driving faster than 50 mph simply by not driving over 50 mph. One does not need to know that this is a rule or even have an intention to follow rules concerning speed limits. Ignorance of a rule does not mean that one fails to act in accordance with it. Likewise, McCain and Rowley claim, one can employ the method of reflective equilibrium without accepting or even being aware of the method being used. So, Applied Evidentialism does not seem to be a kind of methodism.

McCain and Rowley further argue that Applied Evidentialism does not beg the question by assuming that reflective equilibrium is the correct criterion or method at the outset. They maintain that this is not to say that one cannot be aware that reflective equilibrium is a good method from the outset. Rather, they claim that the important point is that Applied Evidentialism does not take the goodness of reflective equilibrium as a starting assumption—perhaps one has the intuition that reflective equilibrium is a good method to employ, perhaps not. The key, they argue, is that unlike methodism Applied Evidentialism does not require one to have beliefs about, or even awareness of, reflective equilibrium to begin to respond to the Problem of the Criterion. So, they argue Applied Evidentialism is not a form of methodism. And thus, Applied Evidentialism does not beg the questions that methodism does.

Even if one accepts that Applied Evidentialism does not beg the question, it may have other problems. It seems that in order to avoid begging the question Applied Evidentialism requires being able to employ reflective equilibrium in responding to the Problem of the Criterion without needing reasons to think that reflective equilibrium is a good method from the start. This, however, seems to commit the supporter of Applied Evidentialism to accepting that certain kinds of circular reasoning can provide one with good reasons. More precisely, if Applied Evidentialism is to avoid being a form of methodism, and the question begging that comes with methodism, then it seems that Applied Evidentialism requires that one can have good reasons to believe the results of employing reflective equilibrium without first having good reasons to accept reflective equilibrium as a good method. But, this allows for epistemic circularity because it can be the case that the claim that reflective equilibrium is a good method is itself one of the results that is produced in the final equilibrium state. The heart of this worry is that Applied Evidentialism allows someone to use reflective equilibrium to come to reasonably believe that reflective equilibrium is a good method for determining true propositions. This is a kind of rule-circularity that occurs when a rule or method is employed to establish that that very rule or method is acceptable. The status of rule-circularity is contentious. Several authors argue that it is benign (for example, Braithwaite (1953), Conee (2004), Matheson (2012), Sosa (2009), and Van Cleve (1984)), but others argue that it is vicious circularity (e.g., Cling (2003) and Vogel (2008)). Depending on whether this circularity is benign or vicious, Applied Evidentialism is a promising or problematic response to the Problem of the Criterion (for more on this issue see epistemic circularity).

b. Dissolution

Robert Amico (1988a, 1993, and 1996) offers a very different response to the Problem of the Criterion. Rather than attempting to solve the Problem of the Criterion, Amico attempts to “dissolve” it. According to Amico, a philosophical problem is a question that can only be answered theoretically; it cannot be answered by purely empirical investigation. Further, a philosophical problem is such that there is rational doubt as to the correct answer to the question asked by the problem. He explains rational doubt as simply the state in which withholding belief in a particular answer is the justified doxastic attitude to take. Since he explicates philosophical problems in terms of rational doubt, and rational doubt is relative to a person, Amico maintains that problems are always relative to particular people. A particular question poses a problem for someone when that question generates rational doubt for her.

It is because of the role of rational doubt that Amico distinguishes between solutions to problems and dissolutions of problems. A solution to a problem is a set of true statements that answers the question generating the problem and removes the rational doubt concerning the answer to that question. Dissolution occurs when the rational doubt is removed, not by an answer to the question, but rather by recognition that it is impossible to adequately answer the question. For example, Amico claims that the problem of how to square a circle is dissolved as soon as one recognizes that it is impossible to make a circular square. Once someone sees that it is impossible to make a circular square, the question “How do you square a circle?” does not generate any rational doubt for her. Without rational doubt, Amico claims, the problem has been dissolved and there is no need to look for a solution.

Like all problems, the Problem of the Criterion is, Amico claims, only a problem for a particular person when its question raises rational doubt for that person. When we first consider the questions posed by the Problem of the Criterion, Amico claims, we may have rational doubt about how to answer them in a way that can be justified to the skeptic. So, we face a problem. However, Amico argues that consideration of the failure of other responses, in particular their tendency to be question begging, together with consideration of the nature of the problem itself allows one to recognize that it is in fact impossible to answer the questions of the Problem of the Criterion in a way that can be justified to the skeptic. Once one recognizes that it is impossible to answer the skeptic’s questions, Amico alleges, the rational doubt generated by the Problem of the Criterion is removed. Thus, he claims that the Problem of the Criterion is at that point dissolved. Since it has been dissolved, we should not be troubled by the Problem of the Criterion at all.

There are three major challenges to Amico’s purported dissolution of the Problem of the Criterion. The first, as Sharon Ryan (1996) argues, is that it does not seem that the problem has been dissolved, but instead it seems that Amico has simply accepted that the skeptic is correct. Amico responds by claiming that the skeptical position is not a solution to the problem because that position cannot be justified to the particularist or the methodist. Since none of the three positions can justify their position to the others, he claims that the problem is dissolved. It is not clear that this adequately responds to Ryan’s criticism because one might think that claiming that there is no acceptable answer to the questions of the Problem of the Criterion is exactly what the skeptic had in mind all along.

The second major challenge to Amico’s view comes from the various responses to the Problem of the Criterion. Although he does discuss several responses, Amico does not argue that all of the responses mentioned above fail to provide answers that remove the rational doubt raised by the Problem of the Criterion. Insofar as one thinks that some of these responses to the Problem of the Criterion provide a solution to the problem, one will rightly be skeptical of Amico’s proffered dissolution.

The third major challenge to Amico’s view arises because he seems to rest his dissolution on what can and cannot be said in response to a skeptic. Andrew Cling argues that the Problem of the Criterion does not require skeptical interlocutors at all. Rather, Cling maintains that the difficulty illuminated by the Problem of the Criterion is that anti-skeptics have commitments that seem plausible when considered individually, but they are jointly inconsistent. The inconsistency among these commitments is present whether or not there are skeptics. Thus, Cling contends that arguing that the Problem of the Criterion is constituted by questions that cannot be answered does not dissolve the problem; it brings the problem to light.

4. The Problem of the Criterion’s Relation to Other Philosophical Problems

The Problem of the Criterion is a significant philosophical issue in its own right—if Chisholm is correct, it is one of the most fundamental of all philosophical problems. However, according to many philosophers, there are additional reasons to study this problem. They claim that the Problem of the Criterion is closely related to several other perennial problems of philosophy. It is worth briefly noting some of the philosophical problems thought to be closely related to the Problem of the Criterion.

First, James Van Cleve (1979) and Ernest Sosa (2007) maintain that the Cartesian Circle is in fact just a special instance of the Problem of the Criterion (see Descartes for more on the Cartesian Circle). Sosa also argues that the problem of easy knowledge is closely related to the Problem of the Criterion, something that Stewart Cohen (2002) and Andrew Cling note as well. In places Sosa seems to go so far as to suggest that the problem of easy knowledge and the Problem of the Criterion are the same problem (see epistemic circularity for more on the problem of easy knowledge).

Next, Ruth Weintraub (1995) argues that Hume’s attack on induction is simply a special case of the Problem of the Criterion. She claims that Hume essentially applies the Problem of the Criterion to induction rather than applying the problem in a general fashion (for more on Humean inductive skepticism see confirmation and induction, epistemology, and Hume: causation).

According to Bryson Brown (2006), the challenge of responding to skepticism about the past is just a version of the Problem of the Criterion. He claims that debunking Bertrand Russell’s five-minute old universe hypothesis, for example, involves providing a criterion for trusting memory. This, he argues, requires satisfactorily responding to the Problem of the Criterion.

Andrew Cling (2009, 2014) maintains that the Problem of the Criterion and the regress argument for skepticism are closely related. In fact, he argues that they are both instances of a more general problem that he calls the “paradox of reasons”. Cling argues that this paradox arises because it seems that it is possible to have reasons for a belief, it seems that reasons themselves must be supported by reasons, and it seems that if an endless sequence of reasons, either in the form of an infinite regress or a circle of reasons, is necessary for having reasons for a belief, then it is impossible to have reasons for a belief. According to Cling, these three commitments are inconsistent. The important point for the current purpose is that Cling maintains that the Problem of the Criterion and the regress argument for skepticism are both instances of the paradox of reasons (see infinitism in epistemology for more on regress arguments).

Finally, Howard Sankey (2010, 2011, and 2012) argues that the Problem of the Criterion provides one of the primary arguments, if not the primary argument, in support of epistemic relativism. Relativists take the Problem of the Criterion to show that it is not possible to provide a justification for choosing one criterion over another. However, rather than opting for skepticism, which claims that no criterion is justified, relativists respond to the Problem of the Criterion by holding that all criteria are equally rational to adopt; one’s choice is determined simply by the context in which one finds oneself. Sankey argues that a clear understanding of the Problem of the Criterion is key to responding to the threat of epistemic relativism (for more on epistemic relativism see relativism).

The Problem of the Criterion is a significant philosophical problem in its own right. However, if these philosophers are correct in claiming that the Problem of the Criterion is related to all of these various philosophical problems in important ways, close study of this problem and its responses could yield insights that are very far-ranging.

5. References and Further Reading

Aikin, S.F. “The Problem of the Criterion and a Hegelian Model for Epistemic Infinitism.” History of Philosophy Quarterly 27 (2010): 379-88.
- Puts forward the view that Hegel proposes what is arguably a coherentist response to the Problem of the Criterion.
Amico, R. P. “Reply to Chisholm on the Problem of the Criterion.” Philosophical Papers 17 (1988a): 235-36.
- Presents a very brief formulation of his dissolution of the Problem of the Criterion.
Amico, R. P. “Roderick Chisholm and the Problem of the Criterion.” Philosophical Papers 17 (1988b): 217-29.
- Argues that Chisholm’s particularist response to the Problem of the Criterion is unsatisfactory.
Amico, R. P. The Problem of the Criterion. Lanham, MD: Rowman & Littlefield Publishers, Inc., 1993.
- The only book-length treatment of the Problem of the Criterion. Includes a helpful discussion of the history of the Problem of the Criterion, critiques of major responses to the Problem of the Criterion, and the full formulation of Amico’s proposed dissolution.
Amico, R. P. “Skepticism and the Problem of the Criterion.” In K. G. Lucey (ed.), On Knowing and the Known. Amherst, NY: Prometheus Books, 1996. 132-41.
- Argues against the skeptical response to the Problem of the Criterion in favor of his dissolution of the problem.
Braithwaite, R.B. Scientific Explanation. Cambridge: Cambridge University Press, 1953.
- Argues that the sort of rule-circularity present in inductive arguments in support of induction is not always vicious.
Brown, B. “Skepticism About the Past and the Problem of the Criterion.” Croatian Journal of Philosophy 6 (2006): 291-306.
- Argues that skepticism about the past is in essence a limited form of the Problem of the Criterion.
Chisholm, R.M. Perceiving. Ithaca, NY: Cornell University Press, 1957.
- Chisholm’s earliest discussion of the Problem of the Criterion appears in this work.
Chisholm, R.M. The Problem of the Criterion. Milwaukee, WI: Marquette University Press, 1973.
- The Aquinas Lecture on the Problem of the Criterion by one of the most influential epistemologists of the twentieth century. Arguably, this is the most important contemporary work on the Problem of the Criterion.
Chisholm, R.M. Theory of Knowledge. Englewood Cliffs, NJ: Prentice Hall, 2nd Edition, 1977; 3rd Edition, 1989.
- Chisholm’s famous and widely used epistemology textbook; contains brief discussions of the Problem of the Criterion in both of its later editions.
Chisholm, R.M. The Foundations of Knowing. Minneapolis, MN: University of Minnesota Press, 1982.
- Contains a reprint of Chisholm’s 1973 Aquinas Lecture.
Chisholm, R.M. “Reply to Amico on the Problem of the Criterion.” Philosophical Papers 17 (1988): 231-34.
- Responds to Amico’s criticisms of his particularist response to the Problem of the Criterion. Claims that the Problem of the Criterion cannot be solved.
Cling, A.D. “Posing the Problem of the Criterion.” Philosophical Studies 75 (1994): 261-92.
- Argues that there are many more options for responding to the Problem of the Criterion than Chisholm considers. Presents his coherentist response to the Problem of the Criterion.
Cling, A.D. “Epistemic Levels and the Problem of the Criterion.” Philosophical Studies 88 (1997): 109-40.
- Presents the Problem of the Criterion as an argument for skepticism. Argues that both Chisholm and Van Cleve fail to solve the problem.
Cling, A.D. “Self-Supporting Arguments.” Philosophy and Phenomenological Research 66 (2003): 279-303.
- Evaluates the strength of self-supporting arguments in deductive and inductive logic. Argues that rule-circularity is a kind of vicious circularity.
Cling, A.D. “Reasons, Regresses, and Tragedy: The Epistemic Regress Problem and the Problem of the Criterion.” American Philosophical Quarterly 46 (2009): 333-46.
- Argues that the Problem of the Criterion and the regress argument for skepticism are both species of a more general problem, the “paradox of reasons”.
Cling, A.D. “The Epistemic Regress Problem, the Problem of the Criterion, and the Value of Reasons.” Metaphilosophy 45 (2014): 161-71.
- Further develops the idea that the Problem of the Criterion and the regress argument for skepticism are both species of a more general problem, the “paradox of reasons”. Also, includes a discussion of the kinds of reasons that this problem reveals we can and cannot have.
Coffey, P. Epistemology or Theory of Knowledge. London: Longmans, Green, 1917.
- This work by D.J. Mercier’s pupil is largely responsible for ushering discussion of the Problem of the Criterion into the 20th century.
Cohen, S. “Basic Knowledge and the Problem of Easy Knowledge.” Philosophy and Phenomenological Research 65 (2002): 309-29.
- Presents the problem of easy knowledge and notes its relevance to the Problem of the Criterion.
Conee, E. “First Things First.” In E. Conee and R. Feldman, Evidentialism. New York: Oxford University Press, 2004. 11-36.
- Presents and defends “Applied Evidentialism” as a response to the Problem of the Criterion.
DePaul, M. “The Problem of the Criterion and Coherence Methods in Ethics.” Canadian Journal of Philosophy 18 (1988): 67-86.
- Presents a version of the Problem of the Criterion in terms of moral theories and describes his coherentist response to the Problem of the Criterion.
DePaul, M. “Pyrrhonian Moral Skepticism and the Problem of the Criterion.” Philosophical Issues 19 (2009): 38-56.
- Claims, like Chisholm, that all responses to the Problem of the Criterion—including the skeptical response—beg the question.
DePaul, M. “Sosa, Certainty and the Problem of the Criterion.” Philosophical Papers 40 (2011): 287-304.
- Suggests that Chisholm’s own particularist response to the Problem of the Criterion may have included some subtle methodism. Also, provides a discussion of Sosa’s recent work on the Problem of the Criterion.
Fumerton, R. “The Problem of the Criterion.” In J. Greco (ed.), The Oxford Handbook of Skepticism. Oxford: Oxford University Press, 2008. 34-52.
- Claims there are at least two distinct problems often called the “Problem of the Criterion”. Also, discusses some responses to the Problem of the Criterion.
Greco, J. “Epistemic Circularity: Vicious, Virtuous and Benign.” International Journal for the Study of Skepticism 1 (2011): 1-8.
- Provides a nice summary of Sosa’s most recent work on the Problem of the Criterion.
Hegel, G.W.F. Phenomenology of Spirit. Oxford: Oxford University Press, 1979.
- Helped draw attention back to the Problem of the Criterion in the 19th century. Presents the Problem of the Criterion as a crisis for Spirit, and (arguably) proposes a coherentist response to the problem.
Lemos, N. Commonsense: A Contemporary Defense. New York: Cambridge University Press, 2004.
- Defends Chisholm’s particularist response to the Problem of the Criterion.
Matheson, J. “Epistemic Relativism.” In A. Cullison (ed.), Continuum Companion to Epistemology. New York: Continuum, 2012. 161-79.
- Argues against epistemic relativism and offers considerations for thinking that at least some kinds of epistemic circularity are not vicious.
Mercier, D.J. Criteriologie, 8th Edition. Paris: Felix Alcan, 1923.
- Helped draw attention back to the Problem of the Criterion in the 19th century. Also, Chisholm cites Mercier’s conditions for what a satisfying criterion would have to look like.
McCain, K. and Rowley, W. “Pick Your Poison: Beg the Question or Embrace Circularity.” International Journal for the Study of Skepticism (2014): 125-40.
- Explains why the three responses to the Problem of the Criterion that Chisholm considers each beg the question. Also, argues that it is possible to respond to the Problem of the Criterion without begging the question, but doing so requires a commitment to certain forms of circularity as epistemically acceptable.
Montaigne, M. de. “Apology for Raymond Sebond.” In J. Zeitlin (trans.), Essays of Michael De Montaigne, New York: Knopf, 1935.
- The Problem of the Criterion appears to have resurfaced in the modern period with this work.
Moser, P.K. Knowledge and Evidence. Cambridge: Cambridge University Press, 1989.
- Presents and defends his explanatory particularist response to the Problem of the Criterion.
Popkin, R.H. The History of Scepticism: From Savonarola to Bayle (Revised and Expanded Edition). New York: Oxford University Press, 2003.
- Discusses the historical development of skepticism. Of particular interest is the discussion of the influence that the Problem of the Criterion had on philosophy during the modern period.
Poston, T. “Explanationist Plasticity & The Problem of the Criterion.” Philosophical Papers 40 (2011): 395-419.
- Defends a coherentist response to the Problem of the Criterion.
Rescher, N. The Coherence Theory of Truth. Oxford: Clarendon Press, 1973a.
- Part of the series of books in which Rescher’s “systems-theoretic approach” to the Problem of the Criterion is developed.
Rescher, N. The Primacy of Practice. Oxford: Basil Blackwell, 1973b.
- Part of the series of books in which Rescher’s “systems-theoretic approach” to the Problem of the Criterion is developed.
Rescher, N. Methodological Pragmatism. Oxford: Basil Blackwell, 1977.
- Part of the series of books in which Rescher’s “systems-theoretic approach” to the Problem of the Criterion is developed.
Rescher, N. Scepticism. Totowa, N.J.: Rowman & Littlefield Publishers, 1980.
- Part of the series of books in which Rescher’s “systems-theoretic approach” to the Problem of the Criterion is developed.
Rockmore, T. “Hegel and Epistemological Constructivism.” Idealistic Studies 36 (2006): 183-90.
- Argues that Hegel proposes a coherentist response to the Problem of the Criterion.
Ryan, S. “Reply to Amico on Skepticism and the Problem of the Criterion.” In K. G. Lucey (ed.), On Knowing and the Known. Amherst, NY: Prometheus Books, 1996. 142-48.
- Argues that Amico’s dissolution of the Problem of the Criterion really amounts to accepting the skeptical response to the Problem of the Criterion.
Sankey, H. “Witchcraft, Relativism and the Problem of the Criterion.” Erkenntnis 72 (2010): 1-16.
- Explores the relationship between epistemic relativism and the Problem of the Criterion.
Sankey, H. “Epistemic Relativism and the Problem of the Criterion.” Studies in History and Philosophy of Science 42 (2011): 562-70.
- Explores the relationship between epistemic relativism and the Problem of the Criterion.
Sankey, H. “Scepticism, Relativism, and the Argument from the Criterion.” Studies in History and Philosophy of Science 43 (2012): 182-90.
- Explores the relationship between epistemic relativism and the Problem of the Criterion.
Sextus Empiricus. The Skeptic Way: Sextus Empiricus’s Outlines of Pyrrhonism, (trans.) B. Mates. New York: Oxford University Press, 1996.
- The original presentation of the Problem of the Criterion.
Sosa, E. A Virtue Epistemology: Apt Belief and Reflective Knowledge, Volume I. New York: Oxford University Press, 2007.
- Argues that the Cartesian Circle is a version of the Problem of the Criterion.
Sosa, E. Reflective Knowledge: Apt Belief and Reflective Knowledge, Volume II. New York: Oxford University Press, 2009.
- Develops Sosa’s response to the Problem of the Criterion. Argues that the problem of easy knowledge is a version of the Problem of the Criterion.
Van Cleve, J. “Foundationalism, Epistemic Principles, and the Cartesian Circle.” The Philosophical Review 88 (1979): 55-91.
- Argues that the Cartesian Circle is simply a special case of the Problem of the Criterion.
Van Cleve, J. “Reliability, Justification, and the Problem of Induction.” Midwest Studies in Philosophy 9 (1984): 555-67.
- Presents an inductive argument in support of induction and argues that the rule-circularity involved in such an argument is not vicious.
Van Cleve, J. “Sosa on Easy Knowledge and the Problem of the Criterion.” Philosophical Studies 153 (2011): 19-28.
- Discusses Sosa’s response to the Problem of the Criterion and the related, according to Sosa, problem of easy knowledge.
Vogel, J. “Epistemic Bootstrapping.” Journal of Philosophy 105 (2008): 518-39.
- Argues that many forms of epistemic circularity are viciously circular.
Weintraub, R. “What Was Hume’s Contribution to the Problem of Induction?” Philosophical Quarterly 45 (1995): 460-70.
- Argues that the problem of induction is simply a special case of the Problem of the Criterion.

Author Information

Kevin McCain
Email: mccain@uab.edu
University of Alabama at Birmingham
U. S. A.

The Ethics of Economic Sanctions

Table of Contents

1. The Nature of Economic Sanctions

a. Definition

b. Objectives

i. Achievement of Foreign Policy Goals

ii. International Law Enforcement

c. Mechanisms

i. Economic Pressure

ii. Non-Economic Pressure

iii. Direct Denial of Resources

iv. Message Sending

v. Punitive Mechanisms

d. Summary

2. The Ethics of Economic Sanctions

a. Just War Theory

i. Objections to the Use of Just War Theory: Christiansen and Powers

ii. Further Objections to the Use of Just War Theory

b. Theories of Law Enforcement

c. Utilitarianism

d. “Clean Hands”

e. Summary

3. References and Further Reading

a. On the Nature of Economic Sanctions

b. On the Ethics of Economic Sanctions

c. Other Referenced Works

Author Information

Presocratics

Table of Contents

1. On “Presocratic” and the Sources

a. The Sources

2. The Milesians

a. Thales

b. Anaximander

c. Anaximenes

3. Xenophanes

4. Pythagoras and Pythagoreanism

5. Heraclitus

6. Eleatic Philosophy

a. Parmenides

i. The Path of Being

ii. The Path of Opinion

b. Zeno

i. Arguments against Plurality

ii. Dichotomy

iii. Infinite Divisibility and Arguments against Motion

c. Melissus

7. Philosophies of Mixture

a. Anaxagoras

b. Empedocles

i. Macrocosm

ii. Microcosm

8. The Atomists

a. Ontology

b. Perception and Epistemology

c. Ethics

9. Diogenes of Apollonia

10. The Sophists and Anonymous Sophistic Texts

a. Protagoras

b. Gorgias

c. Antiphon

d. Prodicus

e. Anonymous Texts

11. Conclusion

12. References and Further Reading

a. Primary Sources

b. Secondary Sources

Author Information

Armed Humanitarian Intervention

Table of Contents

1. What is a Humanitarian Intervention?

2. The Threshold Condition for Intervention

3. Justifying Intervention: Just War Theory

a. Justifying the Recourse to War (jus ad bellum) and Interventions

i. Just Cause

ii. Right Intention and Right Authority

iii. Likelihood of Success, Last Resort, and Proportionality

b. Justifying Conduct in War (jus in bello) and Justice after War (jus post bellum)

c. Some Implications of Justifying Humanitarian Intervention

4. Other Issues and Challenges